NOVEL GLYPHOSATE N-ACETYLTRANSFERASE (GAT) GENES



CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims priority to and benefit of U.S. Provisional Patent
Application Serial No. 60/244,385 filed October 30, 2000, the disclosure of which is
incorporated herein by reference in its entirety for all purposes. 

COPYRIGHT NOTIFICATION PURSUANT TO 37 C.F.R. § 1.71(E) 

A portion of the disclosure of this patent document contains material which
is subject to copyright protection. The copyright owner has no objection to the facsimile
reproduction by anyone of the patent document or the patent disclosure, as it appears in 
the Patent and Trademark Office patent file or records, but otherwise reserves all 
copyright rights whatsoever. 

BACKGROUND OF THE INVENTION 

Crop selectivity to specific herbicides can be conferred by engineering
genes into crops which encode appropriate herbicide metabolizing enzymes. In some
cases these enzymes, and the nucleic acids that encode them, originate in a plant. In other 
cases, they are derived from other organisms, such as microbes. See, e.g., Padgette et al. 
(1996) "New weed control opportunities: Development of soybeans with a Roundup
Ready™ gene" in Herbicide-Resistant Crops (Duke, ed.), pp. 54-84, CRC Press, Boca
Raton; and Vasil (1996) "Phosphinothricin-resistant crops" in Herbicide-Resistant Crops
(Duke, ed.), pp. 85-91. Indeed, transgenic plants have been engineered to express a variety
of herbicide tolerance/metabolizing genes, from a variety of organisms. For example, 
acetohydroxy acid synthase, which has been found to make plants that express this
enzyme resistant to multiple types of herbicides, has been introduced into a variety of
plants (see, e.g., Hattori et al. (1995) Mol Gen Genet 246:419). Other genes that confer
tolerance to herbicides include: a gene encoding a chimeric protein of rat cytochrome
P4507A1 and yeast NADPH-cytochrome P450 oxidoreductase (Shiota et al. (1994) Plant
Physiol 106:17), genes for glutathione reductase and superoxide dismutase
(Aono et al. (1995) Plant Cell Physiol 36:1687), and genes for various phosphotransferases
(Datta et al. (1992) Plant Mol Biol 20:619).

One herbicide which is the subject of much investigation in this regard is
N-phosphonomethylglycine, commonly referred to as glyphosate. Glyphosate is the top 
selling herbicide in the world, with sales projected to reach $5 billion by 2003. It is a 
broad spectrum herbicide that kills both broadleaf and grass-type plants. A successful 
mode of commercial level glyphosate resistance in transgenic plants is by introduction of a
modified Agrobacterium CP4 5-enolpyruvylshikimate-3-phosphate synthase (hereinafter 
referred to as EPSP synthase or EPSPS) gene. The transgene is targeted to the chloroplast 
where it is capable of continuing to synthesize EPSP from phosphoenolpyruvic acid 
(PEP) and shikimate-3-phosphate in the presence of glyphosate. In contrast, the native
EPSP synthase is inhibited by glyphosate. Without the transgene, plants sprayed with
glyphosate quickly die due to inhibition of EPSP synthase which halts the downstream 
pathway needed for aromatic amino acid, hormone, and vitamin biosynthesis. The CP4 
glyphosate-resistant soybean transgenic plants are marketed, e.g., by Monsanto under the 
name "Roundup Ready™."

In the environment, the predominant mechanism by which glyphosate is
degraded is through soil microflora metabolism. The primary metabolite of glyphosate in
soil has been identified as aminomethylphosphonic acid (AMPA), which is ultimately
converted into ammonia, phosphate and carbon dioxide. The proposed metabolic scheme 
that describes the degradation of glyphosate in soil through the AMPA pathway is shown
in Fig. 8. An alternative metabolic pathway for the breakdown of glyphosate by certain
soil bacteria, the sarcosine pathway, occurs via initial cleavage of the C-P bond to give 
inorganic phosphate and sarcosine, as depicted in Fig. 9. 

Another successful herbicide/transgenic crop package is glufosinate 
(phosphinothricin) and the LibertyLink™ trait marketed, e.g., by Aventis. Glufosinate is
also a broad spectrum herbicide. Its target is the glutamate synthase enzyme of the
chloroplast. Resistant plants carry the bar gene from Streptomyces hygroscopicus and
achieve resistance by the N-acetylation activity of bar, which modifies and detoxifies 
glufosinate. 

An enzyme capable of acetylating the primary amine of AMPA is reported 
in PCT Application No. WO00/29596. The enzyme was not described as being able to
acetylate a compound with a secondary amine (e.g., glyphosate). 

While a variety of herbicide resistance strategies are available as noted 
above, additional approaches would have considerable commercial value. The present
invention provides, e.g., novel polynucleotides and polypeptides for conferring herbicide
tolerance, as well as numerous other benefits as will become apparent during review of the 
disclosure. 

SUMMARY OF THE INVENTION 

It is an object of the present invention to provide methods and reagents for
rendering an organism, such as a plant, resistant to glyphosate. This and other objects of
the invention are provided by one or more of the embodiments described below. 

One embodiment of the invention provides novel polypeptides referred to 
herein as GAT polypeptides. GAT polypeptides are characterized by their structural
similarity to one another, e.g., in terms of sequence similarity when the GAT polypeptides
are aligned with one another. Some GAT polypeptides possess glyphosate N-acetyl
transferase activity, i.e., the ability to catalyze the acetylation of glyphosate. Some GAT
polypeptides are also capable of catalyzing the acetylation of glyphosate analogs and/or
glyphosate metabolites, e.g., aminomethylphosphonic acid. 

Also provided are novel polynucleotides referred to herein as GAT
polynucleotides. GAT polynucleotides are characterized by their ability to encode GAT
polypeptides. In some embodiments of the invention, a GAT polynucleotide is engineered 
for better plant expression by replacing one or more parental codons with a synonymous 
codon that is preferentially used in plants relative to the parental codon. In other
embodiments, a GAT polynucleotide is modified by the introduction of a nucleotide
sequence encoding an N-terminal chloroplast transit peptide. 
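
The codon replacement described above can be illustrated with a minimal Python sketch. The preference table below is hypothetical and deliberately tiny (three amino acids); it is not taken from this disclosure, and an actual plant codon usage table would be substituted in practice. The function name `optimize_codons` is likewise an assumption made for illustration.

```python
# Hypothetical preference table: for each amino acid, the synonymous codon
# assumed (for illustration only) to be favored in the target plant.
PREFERRED_CODON = {
    "L": "CTC",
    "R": "AGG",
    "S": "TCC",
}

# Codon -> amino acid for the three amino acids above (a subset of the
# standard genetic code); codons not listed here are left unchanged.
CODON_TO_AA = {
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R", "AGA": "R", "AGG": "R",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def optimize_codons(cds: str) -> str:
    """Replace each parental codon with the preferred synonymous codon.

    The encoded amino acid sequence is unchanged because every replacement
    codon encodes the same amino acid as the codon it replaces.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3].upper()
        amino_acid = CODON_TO_AA.get(codon)
        # Fall back to the parental codon when no preference is defined.
        out.append(PREFERRED_CODON.get(amino_acid, codon))
    return "".join(out)


if __name__ == "__main__":
    print(optimize_codons("ATGTTACGTTCATAA"))
```

In this sketch the start and stop codons pass through untouched because they have no entry in the hypothetical table.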

GAT polypeptides, GAT polynucleotides and glyphosate N-acetyl 
transferase activity are described in more detail below. The invention further includes 
certain fragments of the GAT polypeptides and GAT polynucleotides described herein. 

The invention includes non-native variants of the polypeptides and
polynucleotides described herein, wherein one or more amino acids of the encoded
polypeptide have been mutated. 

The invention further provides a nucleic acid construct comprising a 
polynucleotide of the invention. The construct can be a vector, such as a plant
transformation vector. In some aspects a vector of the invention will comprise a T-DNA
sequence. The construct can optionally include a regulatory sequence (e.g., a promoter) 
operably linked to a GAT polynucleotide, where the promoter is heterologous with 
respect to the polynucleotide and effective to cause sufficient expression of the encoded
polypeptide to enhance the glyphosate tolerance of a plant cell transformed with the
nucleic acid construct. 

In some aspects of the invention, a GAT polynucleotide functions as a 
selectable marker, e.g., in a plant, bacteria, actinomycetes, yeast, algae or other fungi. For 
example, an organism that has been transformed with a vector including a GAT
polynucleotide selectable marker can be selected based on its ability to grow in the 
presence of glyphosate. A GAT marker gene can be used for selection or screening for 
transformed cells expressing the gene. 

The invention further provides vectors with stacked traits, i.e., vectors that
encode a GAT and that also include a second polynucleotide sequence encoding a second
polypeptide that confers a detectable phenotypic trait upon a cell or organism expressing 
the second polypeptide at an effective level. The detectable phenotypic trait can function 
as a selectable marker, e.g., by conferring herbicide resistance, pest resistance, or providing
some sort of visible marker. 

In one embodiment, the invention provides a composition comprising two
or more polynucleotides of the invention.

Compositions containing two or more GAT polynucleotides or encoded 
polypeptides are a feature of the invention. In some cases, these compositions are libraries 
of nucleic acids containing, e.g., at least 3 or more such nucleic acids. Compositions
produced by digesting the nucleic acids of the invention with a restriction endonuclease, a
DNAse or an RNAse, or otherwise fragmenting the nucleic acids, e.g., mechanical 
shearing, chemical cleavage, etc., are also a feature of the invention, as are compositions 
produced by incubating a nucleic acid of the invention with deoxyribonucleotide 
triphosphates and a nucleic acid polymerase, such as a thermostable nucleic acid
polymerase.

Cells transduced by a vector of the invention, or which otherwise 
incorporate the nucleic acid of the invention, are an aspect of the invention. In a preferred 
embodiment, the cells express a polypeptide encoded by the nucleic acid. 

In some embodiments, the cells incorporating the nucleic acids of the 
invention are plant cells. Transgenic plants, transgenic plant cells and transgenic plant
explants incorporating the nucleic acids of the invention are also a feature of the invention.
In some embodiments, the transgenic plants, transgenic plant cells or transgenic plant
explants express an exogenous polypeptide with glyphosate N-acetyltransferase activity
encoded by the nucleic acid of the invention. The invention also provides transgenic seeds
produced by the transgenic plants of the invention.

WO 02/36782
PCT/US01/46227

The invention further provides transgenic plants or transgenic plant 
explants having enhanced tolerance to glyphosate due to the expression of a polypeptide 
with glyphosate N-acetyltransferase activity and a polypeptide that imparts glyphosate
tolerance by another mechanism, such as, a glyphosate-tolerant 5-enolpyruvylshikimate-3- 
phosphate synthase and/or a glyphosate-tolerant glyphosate oxido-reductase. In a further 
embodiment, the invention provides transgenic plants or transgenic plant explants having 
enhanced tolerance to glyphosate, as well as tolerance to an additional herbicide due to the
expression of a polypeptide with glyphosate N-acetyltransferase activity, a polypeptide
that imparts glyphosate tolerance by another mechanism, such as, a glyphosate-tolerant 5-
enolpyruvylshikimate-3-phosphate synthase and/or a glyphosate-tolerant glyphosate
oxido-reductase and a polypeptide imparting tolerance to the additional herbicide, such as, 
a mutated hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate
synthase, a sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant
acetolactate synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a 
phosphinothricin acetyl transferase and a mutated protoporphyrinogen oxidase. 

The invention also provides transgenic plants or transgenic plant explants 
having enhanced tolerance to glyphosate, as well as tolerance to an additional herbicide
due to the expression of a polypeptide with glyphosate N-acetyltransferase activity and a
polypeptide imparting tolerance to the additional herbicide, such as, a mutated 
hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate synthase, a 
sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant acetolactate 
synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a phosphinothricin acetyl
transferase and a mutated protoporphyrinogen oxidase.

Methods of producing the polypeptides of the invention by introducing the 
nucleic acids encoding them into cells and then expressing and recovering them from the 
cells or culture medium are a feature of the invention. In preferred embodiments, the cells 
expressing the polypeptides of the invention are transgenic plant cells. 

Polypeptides that are specifically bound by a polyclonal antiserum that reacts
against an antigen derived from SEQ ID NOS:6-10 and 263-514, but not to a naturally
occurring related sequence, e.g., a peptide represented by a subsequence of
GenBank accession number CAA70664, as well as antibodies which are produced by
administering an antigen derived from any one or more of SEQ ID NOS:6-10 and 263-514
and/or which bind specifically to such antigens and which do not specifically bind to a
naturally occurring polypeptide corresponding to GenBank accession number CAA70664,
are all features of the invention. 

Another aspect of the invention relates to methods of polynucleotide 
diversification to produce novel GAT polynucleotides and polypeptides by recombining or
mutating the nucleic acids of the invention in vitro or in vivo. In an embodiment, the 
recombination produces at least one library of recombinant GAT polynucleotides. The 
libraries so produced are embodiments of the invention, as are cells comprising the 
libraries. Furthermore, methods of producing a modified GAT polynucleotide by mutating
a nucleic acid of the invention are embodiments of the invention. Recombinant and
mutant GAT polynucleotides and polypeptides produced by the methods of the invention
are also embodiments of the invention. 

In some aspects of the invention, diversification is achieved by using 
recursive recombination, which can be accomplished in vitro, in vivo, in silico, or a
combination thereof. Some examples of diversification methods described in more detail
below are family shuffling methods and synthetic shuffling methods. 

The invention provides methods for producing a glyphosate resistant 
transgenic plant or plant cell that involve transforming a plant or plant cell with a 
polynucleotide encoding a glyphosate N-acetyltransferase, and optionally regenerating a
transgenic plant from the transformed plant cell. In some aspects the polynucleotide is a
GAT polynucleotide, optionally a GAT polynucleotide derived from a bacterial source. 
In some aspects of the invention, the method can comprise growing the transformed plant 
or plant cell in a concentration of glyphosate that inhibits the growth of a wild-type plant 
of the same species without inhibiting the growth of the transformed plant. The method
can comprise growing the transformed plant or plant cell or progeny of the plant or plant
cell in increasing concentrations of glyphosate and/or in a concentration of glyphosate that 
is lethal to a wild-type plant or plant cell of the same species. 

A glyphosate resistant transgenic plant produced by this method can be 
propagated, for example by crossing it with a second plant, such that at least some progeny
of the cross display glyphosate tolerance.

The invention further provides methods for selectively controlling weeds in 
a field containing a crop that involve planting the field with crop seeds or plants which are 
glyphosate-tolerant as a result of being transformed with a gene encoding a glyphosate N-
acetyltransferase, and applying to the crop and weeds in the field a sufficient amount of
glyphosate to control the weeds without significantly affecting the crop. 

The invention further provides methods for controlling weeds in a field and 
preventing the emergence of glyphosate resistant weeds in a field containing a crop which 
involve planting the field with crop seeds or plants that are glyphosate tolerant as a result
of being transformed with a gene encoding a glyphosate N-acetyltransferase and a gene 
encoding a polypeptide imparting glyphosate tolerance by another mechanism, such as, a 
glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and/or a glyphosate- 
tolerant glyphosate oxido-reductase and applying to the crop and the weeds in the field a 
sufficient amount of glyphosate to control the weeds without significantly affecting the
crop. 

In a further embodiment the invention provides methods for controlling 
weeds in a field and preventing the emergence of herbicide resistant weeds in a field 
containing a crop which involve planting the field with crop seeds or plants that are
glyphosate tolerant as a result of being transformed with a gene encoding a glyphosate N-
acetyltransferase, a gene encoding a polypeptide imparting glyphosate tolerance by
another mechanism, such as, a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate
synthase and/or a glyphosate-tolerant glyphosate oxido-reductase and a gene encoding a 
polypeptide imparting tolerance to an additional herbicide, such as, a mutated
hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate synthase, a
sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant acetolactate
synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a phosphinothricin acetyl 
transferase and a mutated protoporphyrinogen oxidase and applying to the crop and the 
weeds in the field a sufficient amount of glyphosate and an additional herbicide, such as, a
hydroxyphenylpyruvatedioxygenase inhibitor, sulfonamide, imidazolinone, bialaphos,
phosphinothricin, azafenidin, butafenacil, sulfosate, glufosinate, and a protox inhibitor to
control the weeds without significantly affecting the crop. 

The invention further provides methods for controlling weeds in a field and 
preventing the emergence of herbicide resistant weeds in a field containing a crop which
involve planting the field with crop seeds or plants that are glyphosate tolerant as a result
of being transformed with a gene encoding a glyphosate N-acetyltransferase and a gene 
encoding a polypeptide imparting tolerance to an additional herbicide, such as, a mutated 
hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate synthase, a 
sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant acetolactate
synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a phosphinothricin acetyl
transferase and a mutated protoporphyrinogen oxidase and applying to the crop and the 
weeds in the field a sufficient amount of glyphosate and an additional herbicide, such as, a 
hydroxyphenylpyruvatedioxygenase inhibitor, sulfonamide, imidazolinone, bialaphos, 
phosphinothricin, azafenidin, butafenacil, sulfosate, glufosinate, and a protox inhibitor to
control the weeds without significantly affecting the crop. 

The invention further provides methods for producing a genetically 
transformed plant that is tolerant toward glyphosate that involve inserting into the genome 
of a plant cell a recombinant, double-stranded DNA molecule comprising: (i) a promoter
which functions in plant cells to cause the production of an RNA sequence; (ii) a structural
DNA sequence that causes the production of an RNA sequence which encodes a GAT;
and (iii) a 3' non-translated region which functions in plant cells to cause the addition of a
stretch of polyadenyl nucleotides to the 3' end of the RNA sequence;
where the promoter is heterologous with respect to the structural DNA sequence and
adapted to cause sufficient expression of the encoded polypeptide to enhance the
glyphosate tolerance of a plant cell transformed with the DNA molecule; obtaining a
transformed plant cell; and regenerating from the transformed plant cell a genetically 
transformed plant which has increased tolerance to glyphosate. 

The invention further provides methods for producing a crop that involve
growing a crop plant that is glyphosate-tolerant as a result of being transformed with a
gene encoding a glyphosate N-acetyltransferase, under conditions such that the crop plant
produces a crop; and harvesting a crop from the crop plant. These methods often include 
applying glyphosate to the crop plant at a concentration effective to control weeds. 
Exemplary crop plants include cotton, corn, and soybean. 

The invention also provides computers, computer readable media and
integrated systems, including databases that are composed of sequence records including
character strings corresponding to SEQ ID NOs:1-514. Such integrated systems
optionally include one or more instruction sets for selecting, aligning, translating, reverse-
translating or viewing any one or more character strings corresponding to SEQ ID NOs:1-
514, with each other and/or with any additional nucleic acid or amino acid sequence.
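
An instruction set of the kind described above, for selecting and translating sequence records held as character strings, might be sketched in Python as follows. The record names and the short sequences are hypothetical placeholders and are not SEQ ID NOs from this disclosure; the function names are assumptions, and translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical sequence records stored as character strings.
SEQUENCE_RECORDS = {
    "record_A": "ATGGCTAAATAA",
    "record_B": "ATGGCTAGATAA",
}


def translate(dna: str) -> str:
    """Translate a DNA character string, stopping at the first stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def select_and_translate(records: dict, record_id: str) -> str:
    """Select one record by identifier and return its translation."""
    return translate(records[record_id])


if __name__ == "__main__":
    for name in SEQUENCE_RECORDS:
        print(name, select_and_translate(SEQUENCE_RECORDS, name))
```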

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 depicts the N-acetylation of glyphosate catalyzed by a glyphosate-
N-acetyltransferase ("GAT").

Figure 2 illustrates mass spectroscopic detection of N-acetylglyphosate
produced by an exemplary Bacillus culture expressing a native GAT activity. 

Figure 3 is a table illustrating the relative identity between GAT sequences 
isolated from different strains of bacteria and yitI from Bacillus subtilis.

Figure 4 is a map of the plasmid pMAXY2120 for expression and
purification of the GAT enzyme from E. coli cultures.

Figure 5 is a mass spectrometry output showing increased N-
acetylglyphosate production over time in a typical GAT enzyme reaction mix.

Figure 6 is a plot of the kinetic data of a GAT enzyme from which a Km of 
2.9 mM for glyphosate was calculated.

Figure 7 is a plot of the kinetic data taken from the data of Figure 6 from 
which a Km of 2 was calculated for Acetyl CoA. 

Figure 8 is a scheme that describes the degradation of glyphosate in soil 
through the AMPA pathway.

Figure 9 is a scheme that describes the sarcosine pathway of glyphosate
degradation.

Figure 10 is the BLOSUM62 matrix. 
Figure 11 is a map of the plasmid pMAXY2190. 
Figure 12 depicts a T-DNA construct with gat selectable marker. 
Figure 13 depicts a yeast expression vector with gat selectable marker.

DETAILED DISCUSSION 

The present invention relates to a novel class of enzymes exhibiting N-
acetyltransferase activity. In one aspect, the invention relates to a novel class of enzymes
capable of acetylating glyphosate and glyphosate analogs, e.g., enzymes possessing
glyphosate N-acetyltransferase ("GAT") activity. Such enzymes are characterized by the
ability to acetylate the secondary amine of a compound. In some aspects of the invention,
the compound is a herbicide, e.g., glyphosate, as illustrated schematically in Figure 1. The
compound can also be a glyphosate analog or a metabolic product of glyphosate
degradation, e.g., aminomethylphosphonic acid. Although the acetylation of glyphosate is
a key catalytic step in one metabolic pathway for catabolism of glyphosate, the enzymatic
acetylation of glyphosate by naturally-occurring, isolated, or recombinant enzymes has not
been previously described. Thus, the nucleic acids and polypeptides of the invention
provide a new biochemical pathway for engineering herbicide resistance.

In one aspect, the invention provides novel genes encoding GAT
polypeptides. Isolated and recombinant GAT polynucleotides corresponding to naturally 
occurring polynucleotides, as well as recombinant and engineered, e.g., diversified, GAT 
polynucleotides are a feature of the invention. GAT polynucleotides are exemplified by 
SEQ ID NOS: 1-5 and 11-262. Specific GAT polynucleotide and polypeptide sequences
are provided as examples to help illustrate the invention, and are not intended to limit the 
scope of the genus of GAT polynucleotides and polypeptides described and/or claimed 
herein. 

The invention also provides methods for generating and selecting 
diversified libraries to produce additional GAT polynucleotides, including polynucleotides
encoding GAT polypeptides with improved and/or enhanced characteristics, e.g., altered 
Km for glyphosate, increased rate of catalysis, increased stability, etc., based upon 
selection of a polynucleotide constituent of the library for the new or improved activities 
described herein. Such polynucleotides are especially favorably employed in the 
production of glyphosate resistant transgenic plants.

The GAT polypeptides of the invention exhibit a novel enzymatic activity. 
Specifically, the enzymatic acetylation of the synthetic herbicide glyphosate has not been 
recognized prior to the present invention. Thus, the polypeptides herein described, e.g., as 
exemplified by SEQ ID NOS: 6-10 and 263-514, define a novel biochemical pathway for 
the detoxification of glyphosate that is functional in vivo, e.g., in plants.

Accordingly, the nucleic acids and polypeptides of the invention are of 
significant utility in the generation of glyphosate resistant plants by providing new nucleic 
acids, polypeptides and biochemical pathways for the engineering of herbicide selectivity 
in transgenic plants. 

DEFINITIONS

Before describing the present invention in detail, it is to be understood that 
this invention is not limited to particular compositions or biological systems, which can, of 
course, vary. It is also to be understood that the terminology used herein is for the purpose 
of describing particular embodiments only, and is not intended to be limiting. As used in
this specification and the appended claims, the singular forms "a", "an" and "the" include
plural referents unless the content clearly dictates otherwise. Thus, for example, reference
to "a device" includes a combination of two or more such devices, reference to "a gene
fusion construct" includes mixtures of constructs, and the like.

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as commonly understood by one of ordinary skill in the art to 
which the invention pertains. Although any methods and materials similar or equivalent to 
those described herein can be used in the practice for testing of the present invention, 
specific examples of appropriate materials and methods are described herein.

In describing and claiming the present invention, the following terminology 
will be used in accordance with the definitions set out below. 

For purposes of the present invention, the term "glyphosate" should be 
considered to include any herbicidally effective form of N-phosphonomethylglycine
(including any salt thereof) and other forms which result in the production of the
glyphosate anion in planta. The term "glyphosate analog" refers to any structural analog
of glyphosate that has the ability to inhibit EPSPS at levels such that the glyphosate
analog is herbicidally effective. 

As used herein, the term "glyphosate-N-acetyltransferase activity" or "GAT
activity" refers to the ability to catalyze the acetylation of the secondary amine group of
glyphosate, as illustrated, for example, in Figure 1. A "glyphosate-N-acetyltransferase"
or "GAT" is an enzyme that catalyzes the acetylation of the amine group of glyphosate, a
glyphosate analog, and/or a glyphosate primary metabolite (i.e., AMPA or sarcosine). In
some preferred embodiments of the invention, a GAT is able to transfer the acetyl group
from acetyl CoA to the secondary amine of glyphosate and the primary amine of AMPA.
The exemplary GATs described herein are active from pH 5-9, with optimal activity in the
range of pH 6.5-8.0. Activity can be quantified using various kinetic parameters well
known in the art, e.g., kcat, KM, and kcat/KM. These kinetic parameters can be determined as
described below in Example 7.
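
As a minimal illustration of how these kinetic parameters relate to one another, the following Python sketch assumes standard Michaelis-Menten kinetics, which is an assumption of the sketch rather than a statement of the procedure of Example 7. The function names and the numerical values in the usage lines are hypothetical, apart from the KM of 2.9 mM for glyphosate reported for Figure 6.

```python
def michaelis_menten_rate(substrate_mM: float, vmax: float, km_mM: float) -> float:
    """Initial rate v = Vmax * [S] / (KM + [S]) under Michaelis-Menten kinetics."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


def turnover_number(vmax: float, enzyme_conc: float) -> float:
    """kcat = Vmax / [E]; units follow from the units of the two inputs."""
    return vmax / enzyme_conc


def catalytic_efficiency(kcat: float, km: float) -> float:
    """kcat/KM, commonly used to compare enzyme variants on one substrate."""
    return kcat / km


if __name__ == "__main__":
    km_glyphosate_mM = 2.9   # KM reported for glyphosate (Figure 6)
    vmax = 10.0              # hypothetical Vmax, arbitrary rate units
    # When [S] equals KM the rate is half of Vmax by definition.
    print(michaelis_menten_rate(km_glyphosate_mM, vmax, km_glyphosate_mM))
```

Under this model, a variant with a lower KM for glyphosate or a higher kcat shows a higher kcat/KM, which is one way the improved characteristics mentioned in this disclosure could be compared.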

The terms "polynucleotide," "nucleotide sequence," and "nucleic acid" are
used to refer to a polymer of nucleotides (A,C,T,U,G, etc. or naturally occurring or
artificial nucleotide analogues), e.g., DNA or RNA, or a representation thereof, e.g., a
character string, etc., depending on the relevant context. A given polynucleotide or
complementary polynucleotide can be determined from any specified nucleotide sequence. 
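
Determining the complementary polynucleotide from a specified nucleotide sequence, when the sequence is represented as a character string, can be sketched in Python as below. This is an illustrative sketch for DNA strings using Watson-Crick pairing only; the function name is an assumption, and nucleotide analogues or ambiguity codes are not handled.

```python
# Watson-Crick pairing for DNA character strings (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice returns the original string, which is a quick consistency check on any such implementation.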

Similarly, an "amino acid sequence" is a polymer of amino acids (a protein,
polypeptide, etc.) or a character string representing an amino acid polymer, depending on
context. The terms "protein," "polypeptide" and "peptide" are used interchangeably
herein.

A polynucleotide, polypeptide or other component is "isolated" when it is
partially or completely separated from components with which it is normally associated 
(other proteins, nucleic acids, cells, synthetic reagents, etc.). A nucleic acid or polypeptide 
is "recombinant" when it is artificial or engineered, or derived from an artificial or 
engineered protein or nucleic acid. For example, a polynucleotide that is inserted into a
vector or any other heterologous location, e.g., in a genome of a recombinant organism,
such that it is not associated with nucleotide sequences that normally flank the 
polynucleotide as it is found in nature is a recombinant polynucleotide. A protein 
expressed in vitro or in vivo from a recombinant polynucleotide is an example of a 
recombinant polypeptide. Likewise, a polynucleotide sequence that does not appear in
nature, for example a variant of a naturally occurring gene, is recombinant. 

The terms "glyphosate N-acetyl transferase polypeptide" and "GAT 
polypeptide" are used interchangeably to refer to any of a family of novel polypeptides 
provided herein. 

The terms "glyphosate N-acetyl transferase polynucleotide" and "GAT
polynucleotide" are used interchangeably to refer to a polynucleotide that encodes a GAT
polypeptide. 

A "subsequence" or "fragment" is any portion of an entire sequence. 

Numbering of an amino acid or nucleotide polymer corresponds to 
numbering of a selected amino acid polymer or nucleic acid when the position of a given
monomer component (amino acid residue, incorporated nucleotide, etc.) of the polymer 
corresponds to the same residue position in a selected reference polypeptide or 
polynucleotide. 

A vector is a composition for facilitating cell transduction by a selected 
nucleic acid, or expression of the nucleic acid in the cell. Vectors include, e.g., plasmids,
cosmids, viruses, YACs, bacteria, poly-lysine, chromosome integration vectors, episomal 
vectors, etc. 

"Substantially an entire length of a polynucleotide or amino acid sequence" 
refers to at least about 70%, generally at least about 80%, or typically about 90% or more 
of a sequence.

As used herein, an "antibody" refers to a protein comprising one or more
polypeptides substantially or partially encoded by immunoglobulin genes or fragments of
immunoglobulin genes. The recognized immunoglobulin genes include the kappa,
lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad
immunoglobulin variable region genes. Light chains are classified as either kappa or
lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn 
define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively. A typical 
immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is 
5 composed of two identical pairs of polypeptide chains, each pair having one "light" (about 
25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each chain defines a 
variable region of about 100 to 110 or more amino acids primarily responsible for antigen 
recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to 
these light and heavy chains respectively. Antibodies exist as intact immunoglobulins or 

10 as a number of well characterized fragments produced by digestion with various 

peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in 
the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to 
VH-CH1 by a disulfide bond. The F(ab)'2 may be reduced under mild conditions to break 
the disulfide linkage in the hinge region thereby converting the (Fab')2 dimer into an Fab 1 

15 monomer. The Fab' monomer is essentially an Fab with part of the hinge region (see, 
Fundamental Immunology , 4 th Edition,W.E. Paul (ed.), Raven Press, N. Y. (1998), for a 
more detailed description of other antibody fragments). While various antibody fragments 
are defined in terms of the digestion of an intact antibody, one of skill will appreciate that 
such Fab' fragments may be synthesized de novo either chemically or by utilizing 

20 recombinant DNA methodology. Thus, the term antibody, as used herein also includes 
antibody fragments either produced by the modification of whole antibodies or 
synthesized de novo using recombinant DNA methodologies. Antibodies include single 
chain antibodies, including single chain Fv (sFv) antibodies in which a variable heavy and 
a variable light chain are joined together (directly or through a peptide linker) to form a 

25 continuous polypeptide. 

A "chloroplast transit peptide" is an amino acid sequence which is 
translated in conjunction with a protein and directs the protein to the chloroplast or other 
plastid types present in the cell in which the protein is made. "Chloroplast transit 
sequence" refers to a nucleotide sequence that encodes a chloroplast transit peptide. 

A "signal peptide" is an amino acid sequence which is translated in
conjunction with a protein and directs the protein to the secretory system (Chrispeels, J. J.,
(1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53). If the protein is to be directed to
a vacuole, a vacuolar targeting signal (supra) can further be added, or if to the endoplasmic
reticulum, an endoplasmic reticulum retention signal (supra) may be added. If the protein
is to be directed to the nucleus, any signal peptide present should be removed and instead a
nuclear localization signal included (Raikhel, N. (1992) Plant Phys. 100:1627-1632).

The terms "diversification" and "diversity," as applied to a polynucleotide,
refer to generation of a plurality of modified forms of a parental polynucleotide, or
plurality of parental polynucleotides. In the case where the polynucleotide encodes a
polypeptide, diversity in the nucleotide sequence of the polynucleotide can result in
diversity in the corresponding encoded polypeptide, e.g. a diverse pool of polynucleotides
encoding a plurality of polypeptide variants. In some embodiments of the invention, this
sequence diversity is exploited by screening/selecting a library of diversified
polynucleotides for variants with desirable functional attributes, e.g., a polynucleotide
encoding a GAT polypeptide with enhanced functional characteristics.

The term "encoding" refers to the ability of a nucleotide sequence to code 
for one or more amino acids. The term does not require a start or stop codon. An amino 
acid sequence can be encoded in any one of six different reading frames provided by a
polynucleotide sequence and its complement.
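
The six reading frames mentioned above can be enumerated as in the following minimal sketch. The function names are illustrative only and are not part of the disclosure; the sketch assumes an unambiguous DNA alphabet (A, C, G, T) and simply splits each strand into codons without translating them.

```python
# Complement table for unambiguous DNA bases (assumption: no IUPAC ambiguity codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def six_frames(dna: str) -> dict:
    """Split a DNA sequence into codons for each of the six reading frames.

    Frames +1..+3 read the given strand; frames -1..-3 read its reverse
    complement. Trailing bases that do not fill a codon are dropped.
    """
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames[f"{sign}{offset + 1}"] = codons
    return frames
```

For the hypothetical input "ATGGCC", frame +1 yields the codons ATG and GCC, while frame -1 reads the reverse complement GGCCAT as GGC and CAT. No start or stop codon is required for a frame to be listed, consistent with the definition of "encoding" above.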

When used herein, the term "artificial variant" refers to a polypeptide 
having GAT activity, which is encoded by a modified GAT polynucleotide, e.g., a 
modified form of any one of SEQ ID NOS: 1-5 and 11-262, or of a naturally-occurring
GAT polynucleotide isolated from an organism. The modified polynucleotide, from
which an artificial variant is produced when expressed in a suitable host, is obtained
through human intervention by modification of a GAT polynucleotide. 

The term "nucleic acid construct" or "polynucleotide construct" means a
nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally
occurring gene or which has been modified to contain segments of nucleic acids in a
manner that would not otherwise exist in nature. The term nucleic acid construct is
synonymous with the term "expression cassette" when the nucleic acid construct contains
the control sequences required for expression of a coding sequence of the present
invention.

The term "control sequences" is defined herein to include all components
which are necessary or advantageous for the expression of a polypeptide of the present
invention. Each control sequence may be native or foreign to the nucleotide sequence
encoding the polypeptide. Such control sequences include, but are not limited to, a leader,
polyadenylation sequence, propeptide sequence, promoter, signal peptide sequence, and
transcription terminator. At a minimum, the control sequences include a promoter, and
transcriptional and translational stop signals. The control sequences may be provided with
linkers for the purpose of introducing specific restriction sites facilitating ligation of the
control sequences with the coding region of the nucleotide sequence encoding a
polypeptide.

The term "operably linked" is defined herein as a configuration in which a
control sequence is appropriately placed at a position relative to the coding sequence of
the DNA sequence such that the control sequence directs the expression of a polypeptide.

When used herein the term "coding sequence" is intended to cover a 
nucleotide sequence, which directly specifies the amino acid sequence of its protein 
product. The boundaries of the coding sequence are generally determined by an open
reading frame, which usually begins with the ATG start codon. The coding sequence 
typically includes a DNA, cDNA, and/or recombinant nucleotide sequence. 

In the present context, the term "expression" includes any step involved in 
the production of the polypeptide including, but not limited to, transcription,
post-transcriptional modification, translation, post-translational modification, and secretion.

In the present context, the term "expression vector" covers a DNA 
molecule, linear or circular, that comprises a segment encoding a polypeptide of the
invention, and which is operably linked to additional segments that provide for its 
transcription. 

The term "host cell", as used herein, includes any cell type which is
susceptible to transformation with a nucleic acid construct.

The term "plant" includes whole plants, shoot vegetative organs/structures 
(e.g. leaves, stems and tubers), roots, flowers and floral organs/structures (e.g. bracts, 
sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, endosperm,
and seed coat) and fruit (the mature ovary), plant tissue (e.g. vascular tissue, ground tissue,
and the like) and cells (e.g. guard cells, egg cells, trichomes and the like), and progeny of 
same. The class of plants that can be used in the method of the invention is generally as 
broad as the class of higher and lower plants amenable to transformation techniques, 
including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms,
ferns, and multicellular algae. It includes plants of a variety of ploidy levels, including
aneuploid, polyploid, diploid, haploid and hemizygous. 

The term "heterologous" as used herein describes a relationship between 
two or more elements which indicates that the elements are not normally found in
proximity to one another in nature. Thus, for example, a polynucleotide sequence is
"heterologous to" an organism or a second polynucleotide sequence if it originates from a 
foreign species, or, if from the same species, is modified from its original form. For 
example, a promoter operably linked to a heterologous coding sequence refers to a coding 
sequence from a species different from that from which the promoter was derived, or, if 
from the same species, a coding sequence which is not naturally associated with the 
promoter (e.g. a genetically engineered coding sequence or an allele from a different 
ecotype or variety). An example of a heterologous polypeptide is a polypeptide expressed 
from a recombinant polynucleotide in a transgenic organism. Heterologous 
polynucleotides and polypeptides are forms of recombinant molecules. 

A variety of additional terms are defined or otherwise characterized herein.

GLYPHOSATE N-ACETYLTRANSFERASES

In one aspect, the invention provides a novel family of isolated or 
recombinant enzymes referred to herein as "glyphosate N-acetyltransferases," "GATs,"
or "GAT enzymes." GATs are enzymes that have GAT activity, preferably sufficient 
activity to confer some degree of glyphosate tolerance upon a transgenic plant engineered 
to express the GAT. Some examples of GATs include GAT polypeptides, described in 
more detail below. 

Of course, GAT-mediated glyphosate tolerance is a complex function of 
GAT activity, GAT expression levels in the transgenic plant, the particular plant, the 
nature and timing of herbicide application, etc. One of skill in the art can determine 
without undue experimentation the level of GAT activity required to effect glyphosate 
tolerance in a particular context. 

GAT activity can be characterized using the conventional kinetic
parameters kcat, KM, and kcat/KM. kcat can be thought of as a measure of the rate of
acetylation, particularly at high substrate concentrations; KM is a measure of the affinity of
the GAT for its substrates (e.g., acetyl CoA and glyphosate); and kcat/KM is a measure of
catalytic efficiency that takes both substrate affinity and catalytic rate into account. This
parameter is particularly important in the situation where the concentration of a substrate
is at least partially rate limiting. In general, a GAT with a higher kcat or kcat/KM is a more
efficient catalyst than another GAT with lower kcat or kcat/KM. A GAT with a lower KM is
a more efficient catalyst than another GAT with a higher KM. Thus, to determine whether
one GAT is more effective than another, one can compare kinetic parameters for the two
enzymes. The relative importance of kcat, kcat/KM and KM will vary depending upon the
context in which the GAT will be expected to function, e.g., the anticipated effective
concentration of glyphosate relative to KM for glyphosate. GAT activity can also be
characterized in terms of any of a number of functional characteristics, e.g., stability,
susceptibility to inhibition or activation by other molecules, etc.
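
The dependence of the relative importance of kcat and KM on substrate concentration can be illustrated with the following sketch. It assumes simple single-substrate Michaelis-Menten behavior, which is an assumption made here for illustration and is not a rate law stated in this disclosure (a GAT has two substrates). The function names and the two sets of kinetic values in the usage note are hypothetical.

```python
def turnover_rate(kcat: float, km: float, substrate: float) -> float:
    """Per-enzyme turnover rate at a given substrate concentration.

    Assumed Michaelis-Menten form: v / [E] = kcat * [S] / (KM + [S]).
    km and substrate must share a concentration unit (e.g. mM); the
    result has the same time unit as kcat (e.g. per minute).
    """
    return kcat * substrate / (km + substrate)


def catalytic_efficiency(kcat: float, km: float) -> float:
    """kcat / KM, e.g. in mM^-1 min^-1 when kcat is per minute and KM is in mM."""
    return kcat / km
```

Under these assumptions, a hypothetical enzyme A (kcat 10 per minute, KM 5 mM) outperforms a hypothetical enzyme B (kcat 4 per minute, KM 0.5 mM) when substrate is abundant, whereas B, which has the higher kcat/KM, is faster when the substrate concentration is well below KM.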

GLYPHOSATE N-ACETYLTRANSFERASE POLYPEPTIDES 

In one aspect, the invention provides a novel family of isolated or
recombinant polypeptides referred to herein as "glyphosate N-acetyltransferase
polypeptides" or "GAT polypeptides." GAT polypeptides are characterized by their
structural similarity to a novel family of GATs. Many but not all GAT polypeptides are
GATs. The distinction is that GATs are defined in terms of function, whereas GAT
polypeptides are defined in terms of structure. A subset of the GAT polypeptides consists
of those GAT polypeptides that have GAT activity, preferably at a level that will function
to confer glyphosate resistance upon a transgenic plant expressing the protein at an
effective level. Some preferred GAT polypeptides for use in conferring glyphosate
tolerance have a kcat of at least 1 min⁻¹, or more preferably at least 10 min⁻¹, 100 min⁻¹ or
1000 min⁻¹. Other preferred GAT polypeptides for use in conferring glyphosate tolerance
have a KM no greater than 100 mM, or more preferably no greater than 10 mM, 1 mM, or
0.1 mM. Still other preferred GAT polypeptides for use in conferring glyphosate tolerance
have a kcat/KM of at least 1 mM⁻¹ min⁻¹ or more, preferably at least 10 mM⁻¹ min⁻¹, 100
mM⁻¹ min⁻¹, 1000 mM⁻¹ min⁻¹, or 10,000 mM⁻¹ min⁻¹.

Exemplary GAT polypeptides have been isolated and characterized from a 
variety of bacterial strains. One example of a monomeric GAT polypeptide that has been
isolated and characterized has a molecular radius of approximately 17 kD. An exemplary
GAT enzyme isolated from a strain of B. licheniformis, SEQ ID NO:7, exhibits a KM for
glyphosate of approximately 2.9 mM and a KM for acetyl CoA of approximately 2 µM,
with a kcat equal to 6/minute.

The term "GAT polypeptide" refers to any polypeptide comprising an
amino acid sequence that can be optimally aligned with an amino acid sequence selected
from the group consisting of SEQ ID NOS: 6-10 and 263-514 to generate a similarity
score of at least 430 using the BLOSUM62 matrix, a gap existence penalty of 11, and a
gap extension penalty of 1. Some aspects of the invention pertain to GAT polypeptides
comprising an amino acid sequence that can be optimally aligned with an amino acid
sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514 to
generate a similarity score of at least 440, 445, 450, 455, 460, 465, 470, 475, 480, 485,
490, 495, 500, 505, 510, 515, 520, 525, 530, 535, 540, 545, 550, 555, 560, 565, 570, 575,
580, 585, 590, 595, 600, 605, 610, 615, 620, 625, 630, 635, 640, 645, 650, 655, 660, 665,
670, 675, 680, 685, 690, 695, 700, 705, 710, 715, 720, 725, 730, 735, 740, 745, 750, 755,
or 760 using the BLOSUM62 matrix, a gap existence penalty of 11, and a gap extension
penalty of 1.

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence that can be optimally aligned with SEQ ID NO. 457 to generate a 
similarity score of at least 430 using the BLOSUM62 matrix, a gap existence penalty of 
11, and a gap extension penalty of 1. Some aspects of the invention pertain to GAT
polypeptides comprising an amino acid sequence that can be optimally aligned with SEQ
ID NO. 457 to generate a similarity score of at least 440, 445, 450, 455, 460, 465, 470, 
475, 480, 485, 490, 495, 500, 505, 510, 515, 520, 525, 530, 535, 540, 545, 550, 555, 560, 
565, 570, 575, 580, 585, 590, 595, 600, 605, 610, 615, 620, 625, 630, 635, 640, 645, 650, 
655, 660, 665, 670, 675, 680, 685, 690, 695, 700, 705, 710, 715, 720, 725, 730, 735, 740,
745, 750, 755, or 760 using the BLOSUM62 matrix, a gap existence penalty of 11, and a
gap extension penalty of 1. 

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence that can be optimally aligned with SEQ ID NO. 445 to generate a 
similarity score of at least 430 using the BLOSUM62 matrix, a gap existence penalty of
11, and a gap extension penalty of 1. Some aspects of the invention pertain to GAT
polypeptides comprising an amino acid sequence that can be optimally aligned with SEQ
ID NO. 445 to generate a similarity score of at least 440, 445, 450, 455, 460, 465, 470, 
475, 480, 485, 490, 495, 500, 505, 510, 515, 520, 525, 530, 535, 540, 545, 550, 555, 560, 
565, 570, 575, 580, 585, 590, 595, 600, 605, 610, 615, 620, 625, 630, 635, 640, 645, 650,
655, 660, 665, 670, 675, 680, 685, 690, 695, 700, 705, 710, 715, 720, 725, 730, 735, 740,
745, 750, 755, or 760 using the BLOSUM62 matrix, a gap existence penalty of 11, and a
gap extension penalty of 1. 

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence that can be optimally aligned with SEQ ID NO: 300 to generate a
similarity score of at least 430 using the BLOSUM62 matrix, a gap existence penalty of
11, and a gap extension penalty of 1. Some aspects of the invention pertain to GAT
polypeptides comprising an amino acid sequence that can be optimally aligned with SEQ
ID NO: 300 to generate a similarity score of at least 440, 445, 450, 455, 460, 465, 470,
475, 480, 485, 490, 495, 500, 505, 510, 515, 520, 525, 530, 535, 540, 545, 550, 555, 560,
565, 570, 575, 580, 585, 590, 595, 600, 605, 610, 615, 620, 625, 630, 635, 640, 645, 650,
655, 660, 665, 670, 675, 680, 685, 690, 695, 700, 705, 710, 715, 720, 725, 730, 735, 740, 
745, 750, 755, or 760 using the BLOSUM62 matrix, a gap existence penalty of 11, and a 
gap extension penalty of 1. 

Two sequences are "optimally aligned" when they are aligned for similarity 
scoring using a defined amino acid substitution matrix (e.g., BLOSUM62), gap existence 
penalty and gap extension penalty so as to arrive at the highest score possible for that pair 
of sequences. Amino acid substitution matrices and their use in quantifying the similarity
between two sequences are well-known in the art and described, e.g., in Dayhoff et al. 
(1978) "A model of evolutionary change in proteins." In "Atlas of Protein Sequence and 
Structure," Vol. 5, Suppl. 3 (ed. M.O. Dayhoff), pp. 345-352. Natl. Biomed. Res. Found., 
Washington, DC and Henikoff et al. (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919. 
The BLOSUM62 matrix (Fig. 10) is often used as a default scoring substitution matrix in 
sequence alignment protocols such as Gapped BLAST 2.0. The gap existence penalty is 
imposed for the introduction of a single amino acid gap in one of the aligned sequences, 
and the gap extension penalty is imposed for each additional empty amino acid position 
inserted into an already opened gap. The alignment is defined by the amino acid
positions of each sequence at which the alignment begins and ends, and optionally by the
insertion of a gap or multiple gaps in one or both sequences, so as to arrive at the highest 
possible score. While optimal alignment and scoring can be accomplished manually, the 
process is facilitated by the use of a computer-implemented alignment algorithm, e.g., 
gapped BLAST 2.0, described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402,
and made available to the public at the National Center for Biotechnology Information 
Website (http://www.ncbi.nlm.nih.gov). Optimal alignments, including multiple 
alignments, can be prepared using, e.g., PSI-BLAST, available through 
http://www.ncbi.nlm.nih.gov and described by Altschul et al., (1997) Nucleic Acids Res.
25:3389-3402. 
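
The scoring scheme described above can be sketched as a small dynamic program. The following is an illustration under stated assumptions, not the gapped BLAST implementation: it computes only the best local alignment score; it charges the gap existence penalty for the first position of a gap and the gap extension penalty for each further position, following the wording above (some tools charge the extension penalty for the first position as well); and it uses a hypothetical toy substitution function in place of the BLOSUM62 matrix, which a user would supply as the `score` argument.

```python
from typing import Callable

NEG_INF = float("-inf")


def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a substitution matrix lookup (NOT BLOSUM62)."""
    return 5 if a == b else -4


def best_local_score(
    seq1: str,
    seq2: str,
    score: Callable[[str, str], int] = toy_score,
    gap_existence: int = 11,
    gap_extension: int = 1,
) -> int:
    """Highest local alignment score under affine gap penalties.

    Three states are tracked per cell: M (alignment ends in a residue pair),
    X (ends with a seq1 residue opposite a gap), Y (ends with a seq2 residue
    opposite a gap). Only the previous row is kept in memory.
    """
    m = len(seq2)
    prev_m = [0] * (m + 1)
    prev_x = [NEG_INF] * (m + 1)
    prev_y = [NEG_INF] * (m + 1)
    best = 0
    for i in range(1, len(seq1) + 1):
        cur_m = [0] * (m + 1)
        cur_x = [NEG_INF] * (m + 1)
        cur_y = [NEG_INF] * (m + 1)
        for j in range(1, m + 1):
            # Open a new gap (existence penalty) or lengthen one (extension penalty).
            cur_x[j] = max(prev_m[j] - gap_existence,
                           prev_y[j] - gap_existence,
                           prev_x[j] - gap_extension)
            cur_y[j] = max(cur_m[j - 1] - gap_existence,
                           cur_x[j - 1] - gap_existence,
                           cur_y[j - 1] - gap_extension)
            # A local alignment may start afresh at any residue pair (the 0 term).
            start = max(prev_m[j - 1], prev_x[j - 1], prev_y[j - 1], 0)
            cur_m[j] = start + score(seq1[i - 1], seq2[j - 1])
            best = max(best, cur_m[j])
        prev_m, prev_x, prev_y = cur_m, cur_x, cur_y
    return best
```

With the toy scoring function, aligning the hypothetical strings "ACDEFGHIKL" and "ACDEFHIKL" gives nine matched pairs and a single one-residue gap, i.e. 9 × 5 − 11 = 34. Scores obtained with the toy function are not comparable to the similarity score thresholds recited above, which presuppose BLOSUM62.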

With respect to an amino acid sequence that is optimally aligned with a
reference sequence, an amino acid residue "corresponds to" the position in the reference
sequence with which the residue is paired in the alignment. The "position" is denoted by a
number that sequentially identifies each amino acid in the reference sequence based on its
position relative to the N-terminus. For example, in SEQ ID NO:300 position 1 is M,
position 2 is I, position 3 is E, etc. When a test sequence is optimally aligned with SEQ
ID NO:300, a residue in the test sequence that aligns with the E at position 3 is said to
"correspond to position 3" of SEQ ID NO:300. Owing to deletions, insertions, truncations,
fusions, etc., that must be taken into account when determining an optimal alignment, in
general the amino acid residue number in a test sequence as determined by simply
counting from the N-terminus will not necessarily be the same as the number of its
corresponding position in the reference sequence. For example, in a case where there is a
deletion in an aligned test sequence, there will be no amino acid that corresponds to a
position in the reference sequence at the site of deletion. Where there is an insertion in an
aligned test sequence, that insertion will not correspond to any amino acid position
in the reference sequence. In the case of truncations or fusions there can be stretches of
amino acids in either the reference or aligned sequence that do not correspond to any
amino acid in the corresponding sequence.
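
The notion of correspondence described above can be made concrete with a short sketch. It assumes the alignment has already been computed and is supplied as two equal-length rows in which "-" marks a gap; the function name and the example rows are hypothetical, and only the first three reference residues (M, I, E) are taken from the text.

```python
from typing import Dict, Optional


def corresponding_positions(
    aligned_ref: str, aligned_test: str, gap: str = "-"
) -> Dict[int, Optional[int]]:
    """Map 1-based test residue numbers to 1-based reference positions.

    A test residue opposite a gap in the reference row (an insertion in the
    test sequence) maps to None. A reference position opposite a gap in the
    test row (a deletion in the test sequence) has no test residue mapped to it.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned rows must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    test_pos = 0
    for r, t in zip(aligned_ref, aligned_test):
        if r != gap:
            ref_pos += 1
        if t != gap:
            test_pos += 1
            mapping[test_pos] = ref_pos if r != gap else None
    return mapping
```

For the hypothetical rows "MIEV-KP" (reference) and "M-EVAKP" (test), the second residue of the test sequence (E) corresponds to position 3 of the reference, reference position 2 has no corresponding test residue, and the inserted A (test residue 4) corresponds to no reference position.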

The term "GAT polypeptide" further refers to any polypeptide comprising 
an amino acid sequence having at least 40% sequence identity with an amino acid 
sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. Some
aspects of the invention pertain to GAT polypeptides comprising an amino acid sequence
having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99% sequence
identity with an amino acid sequence selected from the group consisting of SEQ ID NOS:
6-10 and 263-514.

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence having at least 40% sequence identity with SEQ ID NO. 457. Some
aspects of the invention pertain to GAT polypeptides comprising an amino acid sequence 
having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99% sequence 
identity with SEQ ID NO. 457. 

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence having at least 40% sequence identity with SEQ ID NO. 445. Some
aspects of the invention pertain to GAT polypeptides comprising an amino acid sequence 
having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99% sequence 
identity with SEQ ID NO. 445. 

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence having at least 40% sequence identity with SEQ ID NO. 300. Some
aspects of the invention pertain to GAT polypeptides comprising an amino acid sequence 
having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99% sequence 
identity with SEQ ID NO. 300.

The term "GAT polypeptide" further refers to any polypeptide comprising
an amino acid sequence having at least 40% sequence identity with residues 1-96 of an
amino acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and
263-514. Some aspects of the invention pertain to polypeptides comprising an amino acid
sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99%
sequence identity with residues 1-96 of an amino acid sequence selected from the group
consisting of SEQ ID NOS: 6-10 and 263-514.

One aspect of the invention pertains to a polypeptide comprising an amino
acid sequence having at least 40% sequence identity with residues 1-96 of SEQ ID NO.
457. Some aspects of the invention pertain to GAT polypeptides comprising an amino
acid sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99%
sequence identity with residues 1-96 of SEQ ID NO. 457.

One aspect of the invention pertains to a GAT polypeptide comprising an
amino acid sequence having at least 40% sequence identity with residues 1-96 of SEQ ID
NO. 445. Some aspects of the invention pertain to GAT polypeptides comprising an
amino acid sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%,
or 99% sequence identity with residues 1-96 of SEQ ID NO. 445.

One aspect of the invention pertains to a GAT polypeptide comprising an
amino acid sequence having at least 40% sequence identity with residues 1-96 of SEQ ID
NO. 300. Some aspects of the invention pertain to GAT polypeptides comprising an
amino acid sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%,
or 99% sequence identity with residues 1-96 of SEQ ID NO. 300.

The term "GAT polypeptide" further refers to any polypeptide comprising
an amino acid sequence having at least 40% sequence identity with residues 51-146 of an
amino acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and
263-514. Some aspects of the invention pertain to polypeptides comprising an amino acid
sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99%
sequence identity with residues 51-146 of an amino acid sequence selected from the group
consisting of SEQ ID NOS: 6-10 and 263-514.

One aspect of the invention pertains to a polypeptide comprising an amino
acid sequence having at least 40% sequence identity with residues 51-146 of SEQ ID NO.
457. Some aspects of the invention pertain to GAT polypeptides comprising an amino
acid sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, or 99%
sequence identity with residues 51-146 of SEQ ID NO. 457.

One aspect of the invention pertains to a GAT polypeptide comprising an
amino acid sequence having at least 40% sequence identity with residues 51-146 of SEQ 
ID NO. 445. Some aspects of the invention pertain to GAT polypeptides comprising an 
amino acid sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, 
or 99% sequence identity with residues 51-146 of SEQ ID NO. 445. 

One aspect of the invention pertains to a GAT polypeptide comprising an 
amino acid sequence having at least 40% sequence identity with residues 51-146 of SEQ 
ID NO. 300. Some aspects of the invention pertain to GAT polypeptides comprising an 
amino acid sequence having at least 60%, 70%, 80%, 90%, 92%, 95%, 96%, 97%, 98%, 
or 99% sequence identity with residues 51-146 of SEQ ID NO. 300. 

As used herein, the term "identity" or "percent identity" when used with 
respect to a particular pair of aligned amino acid sequences, refers to the percent amino 
acid sequence identity that is obtained by ClustalW analysis (version W 1.8 available from 
European Bioinformatics Institute, Cambridge, UK), counting the number of identical 
matches in the alignment and dividing such number of identical matches by the greater of 
(i) the length of the aligned sequences, and (ii) 96, and using the following default 
ClustalW parameters to achieve slow/accurate pairwise alignments - Gap Open
Penalty: 10; Gap Extension Penalty: 0.10; Protein weight matrix: Gonnet series; DNA
weight matrix: IUB; Toggle Slow/Fast pairwise alignments = SLOW or FULL Alignment.
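
The arithmetic of the percent identity definition above can be sketched as follows. The sketch does not run ClustalW; it assumes a pairwise alignment is already available as two equal-length rows with "-" marking gaps, and it interprets "the length of the aligned sequences" as the number of alignment columns, which is an assumption made here. The function name is illustrative only.

```python
def percent_identity(
    aligned_a: str, aligned_b: str, min_denominator: int = 96, gap: str = "-"
) -> float:
    """Percent identity with the denominator floored at min_denominator.

    Identical matches are columns where both rows carry the same residue
    (gap-versus-gap columns are not counted). The count is divided by the
    greater of the alignment length and min_denominator (96 in the text).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    denominator = max(len(aligned_a), min_denominator)
    return 100.0 * matches / denominator
```

The floor of 96 means that a short alignment cannot reach a high percent identity: two identical 10-residue fragments score 10/96, or about 10.4%, rather than 100%.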

In another aspect, the invention provides an isolated or recombinant 
polypeptide that comprises at least 20, or alternatively, 50, 75, 100, 125 or 140 contiguous 
amino acids of an amino acid sequence selected from the group consisting of SEQ ID 
NOS: 6-10 and 263-514. 

In another aspect, the invention provides an isolated or recombinant 
polypeptide that comprises at least 20, or alternatively, 50, 100 or 140 contiguous amino 
acids of SEQ ID NO:457. 

In another aspect, the invention provides an isolated or recombinant 
polypeptide that comprises at least 20, or alternatively, 50, 100 or 140 contiguous amino 
acids of SEQ ID NO:445.

In another aspect, the invention provides an isolated or recombinant 
polypeptide that comprises at least 20, or alternatively, 50, 100 or 140 contiguous amino 
acids of SEQ ID NO:300.

In another aspect, the invention provides a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263- 
514. 
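
Whether a candidate polypeptide comprises a given number of contiguous amino acids of a reference sequence can be checked with a longest-common-substring computation, sketched below. The function names are illustrative, and the sequences in the usage note are arbitrary strings rather than any sequence of the sequence listing.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                # Extend the run that ended at the previous residue of each sequence.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def comprises_contiguous(candidate: str, reference: str, min_length: int = 20) -> bool:
    """True if candidate contains at least min_length contiguous reference residues."""
    return longest_shared_run(candidate, reference) >= min_length
```

For example, with the arbitrary reference "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL", a candidate built from residues 3 to 24 of that string flanked by unrelated residues shares a run of 22 and therefore satisfies a threshold of 20.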

Some preferred GAT polypeptides of the invention are characterized as 
follows. When optimally aligned with a reference amino acid sequence selected from the
group consisting of SEQ ID NO:6-10 and 263-514, at least 90% of the amino acid residues
in the polypeptide that correspond to the following positions conform to the following
restrictions: (a) at positions 2, 4, 15, 19, 26, 28, 31, 45, 51, 54, 86, 90, 91, 97, 103, 105,
106, 114, 123, 129, 139, and/or 145 the amino acid residue is B1; and (b) at positions 3, 5,
8, 10, 11, 14, 17, 18, 24, 27, 32, 37, 38, 47, 48, 49, 52, 57, 58, 61, 62, 63, 68, 69, 79, 80,
82, 83, 89, 92, 100, 101, 104, 119, 120, 124, 125, 126, 128, 131, 143, and/or 144 the
amino acid residue is B2; wherein B1 is an amino acid selected from the group consisting
of A, I, L, M, F, W, Y, and V; and B2 is an amino acid selected from the group consisting
of R, N, D, C, Q, E, G, H, K, P, S, and T. When used to specify an amino acid or amino
acid residue, the single letter designations A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T,
V, W, and Y have their standard meaning as used in the art and as provided in Table 2 
herein. 
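
The B1/B2 restriction above lends itself to a mechanical check, sketched below under stated assumptions. The position lists are transcribed from the preceding paragraph; the function name and the input format (a mapping from reference position to the test residue that corresponds to it after optimal alignment) are hypothetical. Positions with no corresponding residue are left out of the denominator, which is one reading of "the amino acid residues in the polypeptide that correspond to the following positions".

```python
B1 = set("AILMFWYV")
B2 = set("RNDCQEGHKPST")

# Positions transcribed from restrictions (a) and (b) above.
B1_POSITIONS = {2, 4, 15, 19, 26, 28, 31, 45, 51, 54, 86, 90, 91, 97, 103,
                105, 106, 114, 123, 129, 139, 145}
B2_POSITIONS = {3, 5, 8, 10, 11, 14, 17, 18, 24, 27, 32, 37, 38, 47, 48, 49,
                52, 57, 58, 61, 62, 63, 68, 69, 79, 80, 82, 83, 89, 92, 100,
                101, 104, 119, 120, 124, 125, 126, 128, 131, 143, 144}


def fraction_conforming(residue_at: dict) -> float:
    """Fraction of corresponding residues that satisfy the B1/B2 restrictions.

    residue_at maps a 1-based reference position to the single-letter residue
    of the test polypeptide corresponding to it; positions without a
    corresponding residue are simply absent from the mapping.
    """
    checked = 0
    conforming = 0
    for positions, allowed in ((B1_POSITIONS, B1), (B2_POSITIONS, B2)):
        for pos in positions:
            residue = residue_at.get(pos)
            if residue is None:
                continue  # nothing corresponds to this position (e.g. a deletion)
            checked += 1
            if residue in allowed:
                conforming += 1
    return conforming / checked if checked else 0.0
```

A polypeptide would meet the restriction recited above when `fraction_conforming(...) >= 0.90`. With all 64 listed positions occupied, this tolerates at most six non-conforming residues (58/64 is about 0.906, whereas 57/64 is about 0.891).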

Some preferred GAT polypeptides of the invention are characterized as 
follows. When optimally aligned with a reference amino acid sequence selected from the
group consisting of SEQ ID NO:6-10 and 263-514, at least 80% of the amino acid residues
in the polypeptide that correspond to the following positions conform to the following
restrictions: (a) at positions 2, 4, 15, 19, 26, 28, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114,
129, 139, and/or 145 the amino acid residue is Z1; (b) at positions 31 and/or 45 the amino
acid residue is Z2; (c) at positions 8 and/or 89 the amino acid residue is Z3; (d) at
positions 82, 92, 101 and/or 120 the amino acid residue is Z4; (e) at positions 3, 11, 27
and/or 79 the amino acid residue is Z5; (f) at position 123 the amino acid residue is Z1 or
Z2; (g) at positions 12, 33, 35, 39, 53, 59, 112, 132, 135, 140, and/or 146 the amino acid
residue is Z1 or Z3; (h) at position 30 the amino acid residue is Z1 or Z4; (i) at position 6
the amino acid residue is Z1 or Z6; (j) at positions 81 and/or 113 the amino acid residue is
Z2 or Z3; (k) at positions 138 and/or 142 the amino acid residue is Z2 or Z4; (l) at
positions 5, 17, 24, 57, 61, 124 and/or 126 the amino acid residue is Z3 or Z4; (m) at
position 104 the amino acid residue is Z3 or Z5; (o) at positions 38, 52, 62 and/or 69 the
amino acid residue is Z3 or Z6; (p) at positions 14, 119 and/or 144 the amino acid residue
is Z4 or Z5; (q) at position 18 the amino acid residue is Z4 or Z6; (r) at positions 10, 32,
48, 63, 80 and/or 83 the amino acid residue is Z5 or Z6; (s) at position 40 the amino acid
residue is Z1, Z2 or Z3; (t) at positions 65 and/or 96 the amino acid residue is Z1, Z3 or
Z5; (u) at positions 84 and/or 115 the amino acid residue is Z1, Z3 or Z4; (v) at position
93 the amino acid residue is Z2, Z3 or Z4; (w) at position 130 the amino acid residue is
Z2, Z4 or Z6; (x) at positions 47 and/or 58 the amino acid residue is Z3, Z4 or Z6; (y) at
positions 49, 68, 100 and/or 143 the amino acid residue is Z3, Z4 or Z5; (z) at position 131
the amino acid residue is Z3, Z5 or Z6; (aa) at positions 125 and/or 128 the amino acid
residue is Z4, Z5 or Z6; (ab) at position 67 the amino acid residue is Z1, Z3, Z4 or Z5; (ac)
at position 60 the amino acid residue is Z1, Z4, Z5 or Z6; and (ad) at position 37 the amino
acid residue is Z3, Z4, Z5 or Z6; wherein Z1 is an amino acid selected from the group
consisting of A, I, L, M, and V; Z2 is an amino acid selected from the group consisting of 
F, W, and Y; Z3 is an amino acid selected from the group consisting of N, Q, S, and T; Z4 
is an amino acid selected from the group consisting of R, H, and K; Z5 is an amino acid 
selected from the group consisting of D and E; and Z6 is an amino acid selected from the 
group consisting of C, G, and P. 
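
The Z1 to Z6 classes defined above partition the twenty standard amino acids, so each residue can be assigned to exactly one class and then tested against the set of classes permitted at a position. The sketch below illustrates this with a small excerpt of the restrictions; the names `Z_CLASSES`, `EXAMPLE_RESTRICTIONS`, `z_class`, and `conforms` are hypothetical, and the excerpt is deliberately incomplete.

```python
# Residue classes as defined in the paragraph above.
Z_CLASSES = {
    "Z1": set("AILMV"),
    "Z2": set("FWY"),
    "Z3": set("NQST"),
    "Z4": set("RHK"),
    "Z5": set("DE"),
    "Z6": set("CGP"),
}

# Excerpt only: a few of the per-position restrictions recited above.
EXAMPLE_RESTRICTIONS = {
    8: {"Z3"},                      # clause (c)
    31: {"Z2"},                     # clause (b)
    123: {"Z1", "Z2"},              # clause (f)
    6: {"Z1", "Z6"},                # clause (i)
    40: {"Z1", "Z2", "Z3"},         # clause (s)
    37: {"Z3", "Z4", "Z5", "Z6"},   # clause (ad)
}


def z_class(residue: str) -> str:
    """Return the class name (Z1..Z6) of a single-letter amino acid code."""
    for name, members in Z_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def conforms(position: int, residue: str, restrictions: dict = EXAMPLE_RESTRICTIONS) -> bool:
    """True if the residue's class is permitted at the given reference position."""
    return z_class(residue) in restrictions[position]
```

For instance, a glycine corresponding to position 6 conforms (Z6 is permitted there), while a lysine at that position does not. A complete check would list every clause and then apply the 80% threshold to the residues that correspond to the listed positions.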

Some preferred GAT polypeptides of the invention are characterized as 
follows. When optimally aligned with a reference amino acid sequence selected from the 
group consisting of SEQ ID NO:6-10 and 263-514, at least 90% of the amino acid residues 
in the polypeptide that correspond to the following positions conform to the following 
restrictions: (a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 
107, 110, 117, 118, 121, and/or 141 the amino acid residue is B1; and (b) at positions 16,
21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 99, 102, 108, 109,
111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; wherein B1 is
an amino acid selected from the group consisting of A, I, L, M, F, W, Y, and V; and B2 is 
an amino acid selected from the group consisting of R, N, D, C, Q, E, G, H, K, P, S, and T. 
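By way of example, the "at least 90%" conformance criterion recited above can be evaluated as the fraction of listed positions whose aligned residue falls in the required class. The following is a sketch under stated assumptions: the input `residue_at` is a hypothetical mapping from reference position to the aligned residue of the query polypeptide (however the optimal alignment was obtained), and positions with no aligned residue are simply skipped, which is a choice made for this illustration rather than a rule stated in the text.

```python
# Residue classes and positions as recited in the text.
B1 = set("AILMFWYV")
B2 = set("RNDCQEGHKPST")

B1_POSITIONS = [1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78,
                94, 98, 107, 110, 117, 118, 121, 141]
B2_POSITIONS = [16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77,
                85, 87, 88, 95, 99, 102, 108, 109, 111, 116, 122, 127, 133, 134,
                136, 137]


def fraction_conforming(residue_at):
    """Fraction of the listed positions whose aligned residue is in the required class."""
    checked = 0
    matching = 0
    for positions, allowed in ((B1_POSITIONS, B1), (B2_POSITIONS, B2)):
        for pos in positions:
            residue = residue_at.get(pos)
            if residue is None:
                continue  # assumption: unaligned positions are not counted
            checked += 1
            if residue in allowed:
                matching += 1
    return matching / checked if checked else 0.0


def meets_threshold(residue_at, threshold=0.90):
    """True if at least `threshold` of the checked positions conform."""
    return fraction_conforming(residue_at) >= threshold
```

With 56 listed positions in total, a query that conforms at 51 of them (about 91%) would pass the 90% threshold in this sketch, while one that conforms at 50 (about 89%) would not.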

Some preferred GAT polypeptides of the invention are characterized as 
follows. When optimally aligned with a reference amino acid sequence selected from the 
group consisting of SEQ ID NO:6-10 and 263-514, at least 90% of the amino acid residues 
in the polypeptide that correspond to the following positions conform to the following 
restrictions: (a) at positions 1, 7, 9, 20, 36, 42, 50, 64, 72, 75, 76, 78, 94, 98, 110, 121,
and/or 141 the amino acid residue is Z1; (b) at positions 13, 46, 56, 70, 107, 117, and/or
118 the amino acid residue is Z2; (c) at positions 23, 55, 71, 77, 88, and/or 109 the amino
acid residue is Z3; (d) at positions 16, 21, 41, 73, 85, 99, and/or 111 the amino acid
residue is Z4; (e) at positions 34 and/or 95 the amino acid residue is Z5; (f) at positions 22,
25, 29, 43, 44, 66, 74, 87, 102, 108, 116, 122, 127, 133, 134, 136, and/or 137 the amino
acid residue is Z6; wherein Z1 is an amino acid selected from the group consisting of A, I,
L, M, and V; Z2 is an amino acid selected from the group consisting of F, W, and Y; Z3 is
an amino acid selected from the group consisting of N, Q, S, and T; Z4 is an amino acid
selected from the group consisting of R, H, and K; Z5 is an amino acid selected from the
group consisting of D and E; and Z6 is an amino acid selected from the group consisting
of C, G, and P.

Some preferred GAT polypeptides of the invention are characterized as 
follows. When optimally aligned with a reference amino acid sequence selected from the
group consisting of SEQ ID NO:6-10 and 263-514, at least 80% of the amino acid residues
in the polypeptide that correspond to the following positions conform to the following
restrictions: (a) at position 2 the amino acid residue is I or L; (b) at position 3 the amino
acid residue is E or D; (c) at position 4 the amino acid residue is V, A or I; (d) at position 5
the amino acid residue is K, R or N; (e) at position 6 the amino acid residue is P or L; (f) at
position 8 the amino acid residue is N, S or T; (g) at position 10 the amino acid residue is
E or G; (h) at position 11 the amino acid residue is D or E; (i) at position 12 the amino
acid residue is T or A; (j) at position 14 the amino acid residue is E or K; (k) at position 15
the amino acid residue is I or L; (l) at position 17 the amino acid residue is H or Q; (m) at
position 18 the amino acid residue is R, C or K; (n) at position 19 the amino acid residue is
I or V; (o) at position 24 the amino acid residue is Q or R; (p) at position 26 the amino
acid residue is L or I; (q) at position 27 the amino acid residue is E or D; (r) at position 28
the amino acid residue is A or V; (s) at position 30 the amino acid residue is K, M or R; (t)
at position 31 the amino acid residue is Y or F; (u) at position 32 the amino acid residue is
E or G; (v) at position 33 the amino acid residue is T, A or S; (w) at position 35 the amino
acid residue is L, S or M; (x) at position 37 the amino acid residue is R, G, E or Q; (y) at
position 38 the amino acid residue is G or S; (z) at position 39 the amino acid residue is T,
A or S; (aa) at position 40 the amino acid residue is F, L or S; (ab) at position 45 the
amino acid residue is Y or F; (ac) at position 47 the amino acid residue is R, Q or G; (ad)
at position 48 the amino acid residue is G or D; (ae) at position 49 the amino acid residue
is K, R, E or Q; (af) at position 51 the amino acid residue is I or V; (ag) at position 52 the
amino acid residue is S, C or G; (ah) at position 53 the amino acid residue is I or T; (ai) at
position 54 the amino acid residue is A or V; (aj) at position 57 the amino acid residue is
H or N; (ak) at position 58 the amino acid residue is Q, K, N or P; (al) at position 59 the
amino acid residue is A or S; (am) at position 60 the amino acid residue is E, K, G, V or
D; (an) at position 61 the amino acid residue is H or Q; (ao) at position 62 the amino acid
residue is P, S or T; (ap) at position 63 the amino acid residue is E, G or D; (aq) at position
65 the amino acid residue is E, D, V or Q; (ar) at position 67 the amino acid residue is Q,
E, R, L, H or K; (as) at position 68 the amino acid residue is K, R, E, or N; (at) at position
69 the amino acid residue is Q or P; (au) at position 79 the amino acid residue is E or D;
(av) at position 80 the amino acid residue is G or E; (aw) at position 81 the amino acid
residue is Y, N or F; (ax) at position 82 the amino acid residue is R or H; (ay) at position
83 the amino acid residue is E, G or D; (az) at position 84 the amino acid residue is Q, R
or L; (ba) at position 86 the amino acid residue is A or V; (bb) at position 89 the amino
acid residue is T or S; (bc) at position 90 the amino acid residue is L or I; (bd) at position
91 the amino acid residue is I or V; (be) at position 92 the amino acid residue is R or K;
(bf) at position 93 the amino acid residue is H, Y or Q; (bg) at position 96 the amino acid
residue is E, A or Q; (bh) at position 97 the amino acid residue is L or I; (bi) at position
100 the amino acid residue is K, R, N or E; (bj) at position 101 the amino acid residue is K
or R; (bk) at position 103 the amino acid residue is A or V; (bl) at position 104 the amino
acid residue is D or N; (bm) at position 105 the amino acid residue is L or M; (bn) at
position 106 the amino acid residue is L or I; (bo) at position 112 the amino acid residue is
T or I; (bp) at position 113 the amino acid residue is S, T or F; (bq) at position 114 the
amino acid residue is A or V; (br) at position 115 the amino acid residue is S, R or A; (bs)
at position 119 the amino acid residue is K, E or R; (bt) at position 120 the amino acid
residue is K or R; (bu) at position 123 the amino acid residue is F or L; (bv) at position
124 the amino acid residue is S or R; (bw) at position 125 the amino acid residue is E, K,
G or D; (bx) at position 126 the amino acid residue is Q or H; (by) at position 128 the
amino acid residue is E, G or K; (bz) at position 129 the amino acid residue is V, I or A;
(ca) at position 130 the amino acid residue is Y, H, F or C; (cb) at position 131 the amino
acid residue is D, G, N or E; (cc) at position 132 the amino acid residue is I, T, A, M, V or
L; (cd) at position 135 the amino acid residue is V, T, A or I; (ce) at position 138 the
amino acid residue is H or Y; (cf) at position 139 the amino acid residue is I or V; (cg) at
position 140 the amino acid residue is L or S; (ch) at position 142 the amino acid residue
is Y or H; (ci) at position 143 the amino acid residue is K, T or E; (cj) at position 144 the
amino acid residue is K, E or R; (ck) at position 145 the amino acid residue is L or I; and
(cl) at position 146 the amino acid residue is T or A.

Some preferred GAT polypeptides of the invention are characterized as
follows. When optimally aligned with a reference amino acid sequence selected from the
group consisting of SEQ ID NO:6-10 and 263-514, at least 80% of the amino acid residues
in the polypeptide that correspond to the following positions conform to the following 
restrictions: (a) at position 9, 76, 94 and 110 the amino acid residue is A; (b) at position 29 
and 108 the amino acid residue is C; (c) at position 34 the amino acid residue is D; (d) at 
position 95 the amino acid residue is E; (e) at position 56 the amino acid residue is F; (f) at 
position 43, 44, 66, 74, 87, 102, 116, 122, 127 and 136 the amino acid residue is G; (g) at 
position 41 the amino acid residue is H; (h) at position 7 the amino acid residue is I; (i) at 
position 85 the amino acid residue is K; (j) at position 20, 36, 42, 50, 72, 78, 98 and 121 
the amino acid residue is L; (k) at position 1, 75 and 141 the amino acid residue is M; (l) at
position 23, 64 and 109 the amino acid residue is N; (m) at position 22, 25, 133, 134 and 
137 the amino acid residue is P; (n) at position 71 the amino acid residue is Q; (o) at 
position 16, 21, 73, 99 and 111 the amino acid residue is R; (p) at position 55 and 88 the 
amino acid residue is S; (q) at position 77 the amino acid residue is T; (r) at position 107 
the amino acid residue is W; and (s) at position 13, 46, 70, 117 and 118 the amino acid 
residue is Y. 

Some preferred GAT polypeptides of the invention are characterized as 
follows. When optimally aligned with a reference amino acid sequence selected from the 
group consisting of SEQ ID NO:6-10 and 263-514, the amino acid residue in the 
polypeptide that corresponds to position 28 is V or A. Valine at position 28 generally
correlates with reduced KM, while alanine at that position generally correlates with
increased kcat. Other preferred GAT polypeptides are characterized by having I27 (i.e., an
I at position 27), M30, S35, R37, S39, G48, K49, N57, Q58, P62, Q65, Q67, K68, E83,
S89, A96, E96, R101, T112, A114, K119, K120, E128, V129, D131, T131, V134, R144,
I145, or T146, or any combination thereof.

Some preferred GAT polypeptides of the invention comprise an amino acid 
sequence selected from the group consisting of SEQ ID NOS:6-10 and 263-514. 

The invention further provides preferred GAT polypeptides that are 
characterized by a combination of the foregoing amino acid residue position restrictions. 

In addition, the invention provides GAT polynucleotides encoding the 
preferred GAT polypeptides described above, and complementary nucleotide sequences 
thereof. 

Some aspects of the invention pertain particularly to the subset of any of
the above-described categories of GAT polypeptides having GAT activity, as described
herein. These GAT polypeptides are preferred, for example, for use as agents for
conferring glyphosate resistance upon a plant. Examples of desired levels of GAT activity
are described herein. 

In one aspect, the GAT polypeptides comprise an amino acid sequence 
encoded by a recombinant or isolated form of naturally occurring nucleic acids isolated 
from a natural source, e.g., a bacterial strain. Wild-type polynucleotides encoding such 
GAT polypeptides may be specifically screened for by standard techniques known in the 
art. The polypeptides defined by SEQ ID NO:6 to SEQ ID NO: 10, for example, were 
discovered by expression cloning of sequences from Bacillus strains exhibiting GAT 
activity, as described in more detail below. 

The invention also includes isolated or recombinant polypeptides which are 
encoded by an isolated or recombinant polynucleotide comprising a nucleotide sequence 
which hybridizes under stringent conditions over substantially the entire length of a 
nucleotide sequence selected from the group consisting of SEQ ID NOS: 1-5 and 11-262, 
their complements, and nucleotide sequences encoding an amino acid sequence selected 
from the group consisting of SEQ ID NOS: 6-10 and 263-514, including their 
complements. 

The invention further includes any polypeptide having GAT activity that is 
encoded by a fragment of any of the GAT-encoding polynucleotides described herein. 

The invention also provides fragments of GAT polypeptides that can be 
spliced together to form a functional GAT polypeptide. Splicing can be accomplished in 
vitro or in vivo, and can involve cis or trans (i.e., intramolecular or intermolecular) 
splicing. The fragments themselves can, but need not, have GAT activity. For example, 
two or more segments of a GAT polypeptide can be separated by inteins; removal of the 
intein sequence by cis-splicing results in a functional GAT polypeptide. In another 
example, an encrypted GAT polypeptide can be expressed as two or more separate 
fragments; trans-splicing of these segments results in recovery of a functional GAT 
polypeptide. Various aspects of cis and trans splicing, gene encryption, and introduction 
of intervening sequences are described in more detail in US patent application Nos. 
09/517,933 and 09/710,686, both of which are incorporated by reference herein in their 
entirety. 

In general, the invention includes any polypeptide encoded by a modified 
GAT polynucleotide derived by mutation, recursive sequence recombination, and/or 
diversification of the polynucleotide sequences described herein. In some aspects of the 
invention, a GAT polypeptide is modified by a single or multiple amino acid substitution,
a deletion, an insertion, or a combination of one or more of these types of modifications.
Substitutions can be conservative, or non-conservative, can alter function or not, and can
add new function. Insertions and deletions can be substantial, such as the case of a
truncation of a substantial fragment of the sequence, or in the fusion of additional
sequence, either internally or at the N or C terminus. In some embodiments of the invention, a
GAT polypeptide is part of a fusion protein comprising a functional addition such as, for
example, a secretion signal, a chloroplast transit peptide, a purification tag, or any of
numerous other functional groups that will be apparent to the skilled artisan, and which are
described in more detail elsewhere in this specification.

Polypeptides of the invention may contain one or more modified amino
acids. The presence of modified amino acids may be advantageous in, for example, (a)
increasing polypeptide in vivo half-life, (b) reducing or increasing polypeptide
antigenicity, or (c) increasing polypeptide storage stability. Amino acid(s) are modified, for
example, co-translationally or post-translationally during recombinant production (e.g.,
N-linked glycosylation at N-X-S/T motifs during expression in mammalian cells) or
modified by synthetic means.

Non-limiting examples of a modified amino acid include a glycosylated
amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated)
amino acid, an acetylated amino acid, an acylated amino acid, a PEG-ylated amino acid, a
biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the
like. References adequate to guide one of skill in the modification of amino acids are
replete throughout the literature. Example protocols are found in Walker (1998) Protein
Protocols on CD-ROM Humana Press, Totowa, NJ.

Recombinant methods for producing and isolating GAT polypeptides of the
invention are described herein. In addition to recombinant production, the polypeptides
may be produced by direct peptide synthesis using solid-phase techniques (e.g., Stewart et
al. (1969) Solid-Phase Peptide Synthesis, WH Freeman Co, San Francisco; Merrifield
(1963) J. Am. Chem. Soc. 85:2149-2154). Peptide synthesis may be performed using
manual techniques or by automation. Automated synthesis may be achieved, for example,
using an Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer, Foster City, Calif.) in
accordance with the instructions provided by the manufacturer. For example,
subsequences may be chemically synthesized separately and combined using chemical
methods to provide full-length GAT polypeptides. Peptides can also be ordered from a
variety of sources.

In another aspect of the invention, a GAT polypeptide of the invention is
used to produce antibodies which have, e.g., diagnostic uses, for example, related to the 
activity, distribution, and expression of GAT polypeptides, for example, in various tissues 
of a transgenic plant. 

GAT homologue polypeptides for antibody induction do not require
biological activity; however, the polypeptide or oligopeptide must be antigenic. Peptides 
used to induce specific antibodies may have an amino acid sequence consisting of at least 
10 amino acids, preferably at least 15 or 20 amino acids. Short stretches of a GAT 
polypeptide may be fused with another protein, such as keyhole limpet hemocyanin, and 
antibody produced against the chimeric molecule. 

Methods of producing polyclonal and monoclonal antibodies are known to 
those of skill in the art, and many antibodies are available. See, e.g., Coligan (1991) 
Current Protocols in Immunology Wiley/Greene, NY; and Harlow and Lane (1989)
Antibodies: A Laboratory Manual Cold Spring Harbor Press, NY; Stites et al. (eds.) Basic 
and Clinical Immunology (4th ed.) Lange Medical Publications, Los Altos, CA, and 
references cited therein; Goding (1986) Monoclonal Antibodies: Principles and Practice 
(2d ed.) Academic Press, New York, NY; and Kohler and Milstein (1975) Nature 256: 
495-497. Other suitable techniques for antibody preparation include selection of libraries 
of recombinant antibodies in phage or similar vectors. See, Huse et al. (1989) Science 
246: 1275-1281; and Ward, et al. (1989) Nature 341: 544-546. Specific monoclonal and 
polyclonal antibodies and antisera will usually bind with a KD of at least about 0.1 µM,
preferably at least about 0.01 µM or better, and most typically and preferably, 0.001 µM
or better.

Additional details on antibody production and engineering techniques can be
found in Borrebaeck (ed) (1995) Antibody Engineering, 2nd Edition Freeman and
Company, NY (Borrebaeck); McCafferty et al. (1996) Antibody Engineering, A Practical
Approach IRL at Oxford Press, Oxford, England (McCafferty), and Paul (1995) Antibody
Engineering Protocols Humana Press, Totowa, NJ (Paul).

Sequence Variations 

GAT polypeptides of the present invention include conservatively modified 
variations of the sequences disclosed herein as SEQ ID NOS: 6-10 and 263-514. Such 
conservatively modified variations comprise substitutions, additions or deletions which 
alter, add or delete a single amino acid or a small percentage of amino acids (typically less
than about 5%, more typically less than about 4%, 2%, or 1%) in any of SEQ ID NOS: 6-
10 and 263-514. 

For example, a conservatively modified variation (e.g., deletion) of the 146 
amino acid polypeptide identified herein as SEQ ID NO:6 will have a length of at least 
140 amino acids, preferably at least 141 amino acids, more preferably at least 144 amino 
acids, and still more preferably at least 146 amino acids, corresponding to a deletion of 
less than about 5%, 4%, 2% or about 1%, or less of the polypeptide sequence. 

Another example of a conservatively modified variation (e.g., a 
"conservatively substituted variation") of the polypeptide identified herein as SEQ ID 
NO:6 will contain "conservative substitutions", according to the six substitution groups set 
forth in Table 2 (infra), in up to about 7 residues (i.e., less than about 5%) of the 146 
amino acid polypeptide. 
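The arithmetic behind the "up to about 7 residues (i.e., less than about 5%)" figure can be reproduced with a short calculation. The sketch below is illustrative only; the function name `max_changed_residues` is hypothetical, and it interprets "less than" strictly, which is an assumption about how the percentage limit is applied.

```python
def max_changed_residues(length, percent):
    """Largest whole number of residues that is strictly less than
    `percent` percent of a polypeptide of the given length.

    Integer arithmetic is used to avoid floating-point rounding at the boundary.
    """
    if length <= 0 or percent <= 0:
        return 0
    return (length * percent - 1) // 100


# The 146 amino acid polypeptide discussed in the text.
LENGTH = 146
for pct in (5, 4, 2, 1):
    print(pct, max_changed_residues(LENGTH, pct))
```

For the 146 residue polypeptide, 5% corresponds to 7.3 residues, so at most 7 residues can be changed while staying under that limit, consistent with the figure given above.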

The GAT polypeptide sequence homologues of the invention, including 
conservatively substituted sequences, can be present as part of larger polypeptide 
sequences such as occur in a GAT polypeptide, in a GAT fusion with a signal sequence, 
e.g., a chloroplast targeting sequence, or upon the addition of one or more domains for
purification of the protein (e.g., poly his segments, FLAG tag segments, etc.). In the latter
case, the additional functional domains have little or no effect on the activity of the GAT
portion of the protein, or the additional domains can be removed by post synthesis
processing steps such as by treatment with a protease.

Defining Polypeptides by Immunoreactivity

Because the polypeptides of the invention provide a new class of enzymes 
with a defined activity, i.e., the acetylation of glyphosate, the polypeptides also provide 
new structural features which can be recognized, e.g., in immunological assays. The 
generation of antisera which specifically bind the polypeptides of the invention, as well
as the polypeptides which are bound by such antisera, are a feature of the invention. 

The invention includes GAT polypeptides that specifically bind to or that 
are specifically immunoreactive with an antibody or antisera generated against an 
immunogen comprising an amino acid sequence selected from one or more of SEQ ID 
NO:6 to SEQ ID NO: 10. To eliminate cross-reactivity with other GAT homologues, the 
antibody or antisera is subtracted with available related proteins, such as those represented 
by the proteins or peptides corresponding to GenBank accession numbers available as of 
the filing date of this application, and exemplified by CAA70664, Z99109 and Y09476.
Where the accession number corresponds to a nucleic acid, a polypeptide encoded by the
nucleic acid is generated and used for antibody/antisera subtraction purposes. Figure 3
tabulates the relative identity between exemplary GAT polypeptides and the most closely
related sequence available in GenBank, YitI. The function of native YitI has yet to be
elucidated, but the enzyme has been shown to possess detectable GAT activity.

In one typical format, the immunoassay uses a polyclonal antiserum which 
was raised against one or more polypeptides comprising one or more of the sequences
corresponding to one or more of SEQ ID NOS: 6-10 and 263-514, or a substantial 
subsequence thereof (i.e., at least about 30% of the full length sequence provided). The 
full set of potential polypeptide immunogens derived from SEQ ID NOS: 6-10 and 263- 
514 are collectively referred to below as "the immunogenic polypeptides." The resulting 
antisera is optionally selected to have low cross-reactivity against other related sequences 
and any such cross-reactivity is removed by immunoabsorption with one or more of the
related sequences, prior to use of the polyclonal antiserum in the immunoassay. 

In order to produce antisera for use in an immunoassay, one or more of the 
immunogenic polypeptides is produced and purified as described herein. For example, 
recombinant protein may be produced in a bacterial cell line. An inbred strain of mice 
(used in this assay because results are more reproducible due to the virtual genetic identity 
of the mice) is immunized with the immunogenic protein(s) in combination with a 
standard adjuvant, such as Freund's adjuvant, and a standard mouse immunization protocol 
(see, Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor
Publications, New York, for a standard description of antibody generation, immunoassay 
formats and conditions that can be used to determine specific immunoreactivity). 
Alternatively, one or more synthetic or recombinant polypeptide derived from the 
sequences disclosed herein is conjugated to a carrier protein and used as an immunogen. 

Polyclonal sera are collected and titered against the immunogenic 
polypeptide in an immunoassay, for example, a solid phase immunoassay with one or 
more of the immunogenic proteins immobilized on a solid support. Polyclonal antisera 
with a titer of 10^6 or greater are selected, pooled and subtracted with related polypeptides,
e.g., those identified from GENBANK as noted, to produce subtracted pooled titered 
polyclonal antisera. 

The subtracted pooled titered polyclonal antisera are tested for cross 
reactivity against the related polypeptides. Preferably at least two of the immunogenic
GATs are used in this determination, preferably in conjunction with at least two of the related
polypeptides, to identify antibodies which are specifically bound by the immunogenic
protein(s).

In this comparative assay, discriminatory binding conditions are determined
for the subtracted titered polyclonal antisera which result in at least about a 5-10 fold
higher signal to noise ratio for binding of the titered polyclonal antisera to the
immunogenic GAT polypeptides as compared to binding to the related polypeptides. That
is, the stringency of the binding reaction is adjusted by the addition of non-specific
competitors such as albumin or non-fat dry milk, or by adjusting salt conditions,
temperature, or the like. These binding conditions are used in subsequent assays for
determining whether a test polypeptide is specifically bound by the pooled subtracted
polyclonal antisera. In particular, test polypeptides which show at least a 2-5x higher
signal to noise ratio than the control polypeptides under discriminatory binding conditions,
and at least about a 1/2 signal to noise ratio as compared to the immunogenic
polypeptide(s), share substantial structural similarity with the immunogenic polypeptide
as compared to known GAT, and are, therefore, polypeptides of the invention.
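The two signal to noise criteria in the preceding paragraph can be expressed as a simple decision rule. The following sketch is illustrative and assumes that signal to noise ratios have already been measured under the discriminatory binding conditions; the function name and parameter names are hypothetical. The defaults use the lower bound of the recited 2-5x range and a one-half fraction relative to the immunogenic polypeptide, both of which are adjustable assumptions rather than fixed values.

```python
def passes_comparative_assay(test_snr, control_snr, immunogen_snr,
                             min_fold_over_control=2.0,
                             min_fraction_of_immunogen=0.5):
    """Apply the two signal-to-noise criteria to a test polypeptide.

    test_snr       -- signal to noise ratio for the test polypeptide
    control_snr    -- signal to noise ratio for the control (related) polypeptide
    immunogen_snr  -- signal to noise ratio for the immunogenic polypeptide
    """
    exceeds_control = test_snr >= min_fold_over_control * control_snr
    near_immunogen = test_snr >= min_fraction_of_immunogen * immunogen_snr
    return exceeds_control and near_immunogen
```

A stricter reading of the range can be obtained by passing `min_fold_over_control=5.0`.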

In another example, immunoassays in the competitive binding format are
used for detection of a test polypeptide. For example, as noted, cross-reacting antibodies
are removed from the pooled antisera mixture by immunoabsorption with the control GAT
polypeptides. The immunogenic polypeptide(s) are then immobilized to a solid support
which is exposed to the subtracted pooled antisera. Test proteins are added to the assay to
compete for binding to the pooled subtracted antisera. The ability of the test protein(s) to
compete for binding to the pooled subtracted antisera as compared to the immobilized
protein(s) is compared to the ability of the immunogenic polypeptide(s) added to the assay
to compete for binding (the immunogenic polypeptides compete effectively with the
immobilized immunogenic polypeptides for binding to the pooled antisera). The percent
cross-reactivity for the test proteins is calculated, using standard calculations.

In a parallel assay, the ability of the control proteins to compete for binding
to the pooled subtracted antisera is optionally determined as compared to the ability of the
immunogenic polypeptide(s) to compete for binding to the antisera. Again, the percent
cross-reactivity for the control polypeptides is calculated, using standard calculations.
Where the percent cross-reactivity is at least 5-10x as high for the test polypeptides, the
test polypeptides are said to specifically bind the pooled subtracted antisera.

In general, the immunoabsorbed and pooled antisera can be used in a
competitive binding immunoassay as described herein to compare any test polypeptide to
the immunogenic polypeptide(s). In order to make this comparison, the two polypeptides
are each assayed at a wide range of concentrations and the amount of each polypeptide 
required to inhibit 50% of the binding of the subtracted antisera to the immobilized protein 
is determined using standard techniques. If the amount of the test polypeptide required is 
less than twice the amount of the immunogenic polypeptide that is required, then the test 
polypeptide is said to specifically bind to an antibody generated to the immunogenic 
protein, provided the amount is at least about 5-10x as high as for a control polypeptide. 
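As a non-limiting illustration, the 50% inhibition comparison described above reduces to a ratio test once the amount of each polypeptide required to inhibit 50% of binding has been determined. In the sketch below the function names are hypothetical, and the percent cross-reactivity formula (immunogen amount divided by test amount, times 100) is one commonly used convention that is assumed here; the text itself refers only to "standard calculations".

```python
def within_twofold_of_immunogen(test_ic50, immunogen_ic50, max_ratio=2.0):
    """True if the amount of test polypeptide needed for 50% inhibition is
    less than `max_ratio` times the amount of immunogenic polypeptide needed."""
    if test_ic50 <= 0 or immunogen_ic50 <= 0:
        raise ValueError("50% inhibition amounts must be positive")
    return test_ic50 < max_ratio * immunogen_ic50


def percent_cross_reactivity(test_ic50, immunogen_ic50):
    """Percent cross-reactivity under the assumed convention:
    100 * (immunogen amount at 50% inhibition) / (test amount at 50% inhibition)."""
    if test_ic50 <= 0 or immunogen_ic50 <= 0:
        raise ValueError("50% inhibition amounts must be positive")
    return 100.0 * immunogen_ic50 / test_ic50
```

Under this convention, a test polypeptide requiring exactly twice the amount of the immunogenic polypeptide has 50% cross-reactivity and falls just outside the less-than-twofold criterion.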

As a final determination of specificity, the pooled antisera is optionally 
fully immunosorbed with the immunogenic polypeptide(s) (rather than the control 
polypeptides) until little or no binding of the resulting immunogenic polypeptide 
subtracted pooled antisera to the immunogenic polypeptide(s) used in the immunosorption
is detectable. This fully immunosorbed antisera is then tested for reactivity with the test 
polypeptide. If little or no reactivity is observed (i.e., no more than 2x the signal to noise 
ratio observed for binding of the fully immunosorbed antisera to the immunogenic 
polypeptide), then the test polypeptide is specifically bound by the antisera elicited by the 
immunogenic protein. 

GLYPHOSATE N-ACETYLTRANSFERASE POLYNUCLEOTIDES

In one aspect, the invention provides a novel family of isolated or
recombinant polynucleotides referred to herein as "glyphosate N-acetyltransferase
polynucleotides" or "GAT polynucleotides." GAT polynucleotide sequences are 
characterized by the ability to encode a GAT polypeptide. In general, the invention 
includes any nucleotide sequence that encodes any of the novel GAT polypeptides 
described herein. In some aspects of the invention, a GAT polynucleotide that encodes a 
GAT polypeptide with GAT activity is preferred. 

In one aspect, the GAT polynucleotides comprise recombinant or isolated 
forms of naturally occurring nucleic acids isolated from an organism, e.g., a bacterial
strain. Exemplary GAT polynucleotides, e.g., SEQ ID NO:1 to SEQ ID NO:5, were
discovered by expression cloning of sequences from Bacillus strains exhibiting GAT
activity. Briefly, a collection of approximately 500 Bacillus and Pseudomonas strains
was screened for native ability to N-acetylate glyphosate. Strains were grown in LB
overnight, harvested by centrifugation, permeabilized in dilute toluene, and then washed
and resuspended in a reaction mix containing buffer, 5 mM glyphosate, and 200 µM
acetyl-CoA. The cells were incubated in the reaction mix for between 1 and 48 hours, at
which time an equal volume of methanol was added to the reaction. The cells were then
pelleted by centrifugation and the supernatant was filtered before analysis by parent ion
mode mass spectrometry. The product of the reaction was positively identified as N- 
acetylglyphosate by comparing the mass spectrometry profile of the reaction mix to an N- 
acetylglyphosate standard as shown in Figure 2. Product detection was dependent on 
inclusion of both substrates (acetylCoA and glyphosate) and was abolished by heat 
denaturing the bacterial cells. 
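For orientation, the mass difference expected between glyphosate and its N-acetylated product can be computed from their elemental compositions. The sketch below is illustrative only and is not taken from the data of Figure 2: it assumes the molecular formula C3H8NO5P for glyphosate, treats N-acetylation as a net addition of C2H2O, and uses standard monoisotopic atomic masses. It reports neutral monoisotopic masses and makes no assumption about the ionization mode or adducts observed in the actual analysis.

```python
# Standard monoisotopic atomic masses (in unified atomic mass units).
MONOISOTOPIC = {
    "C": 12.000000,
    "H": 1.007825,
    "N": 14.003074,
    "O": 15.994915,
    "P": 30.973762,
}

# Elemental compositions (assumed formulas, see lead-in).
GLYPHOSATE = {"C": 3, "H": 8, "N": 1, "O": 5, "P": 1}
ACETYL_SHIFT = {"C": 2, "H": 2, "O": 1}  # net change on N-acetylation


def monoisotopic_mass(formula):
    """Sum of monoisotopic atomic masses for an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


# Composition of N-acetylglyphosate: glyphosate plus the acetyl shift.
N_ACETYLGLYPHOSATE = {
    element: GLYPHOSATE[element] + ACETYL_SHIFT.get(element, 0)
    for element in GLYPHOSATE
}

if __name__ == "__main__":
    print(round(monoisotopic_mass(GLYPHOSATE), 4))
    print(round(monoisotopic_mass(N_ACETYLGLYPHOSATE), 4))
    print(round(monoisotopic_mass(ACETYL_SHIFT), 4))
```

Under these assumptions the product is heavier than the substrate by about 42.01 mass units, which is the shift one would look for when comparing a reaction mix to an N-acetylglyphosate standard.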

Individual GAT polynucleotides were then cloned from the identified 
strains by functional screening. Genomic DNA was prepared and partially digested with 
Sau3AI enzyme. Fragments of approximately 4 Kb were cloned into an E. coli expression
vector and transformed into electrocompetent E. coli. Individual clones exhibiting GAT 
activity were identified by mass spectrometry following a reaction as described previously 
except that the toluene wash was replaced by permeabilization with PMBS. Genomic 
fragments were sequenced and the putative GAT polypeptide-encoding open reading 
frame identified. Identity of the GAT gene was confirmed by expression of the open 
reading frame in E. coli and detection of high levels of N-acetylglyphosate produced from 
reaction mixtures. 
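When planning a functional screen of this kind, the number of clones needed to cover a genome with a given probability is often estimated with the Clarke and Carbon relation N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size. The sketch below applies that relation to the approximately 4 Kb fragments mentioned above. The genome size of 4.2 Mb (roughly that of Bacillus subtilis) and the 99% target probability are assumptions chosen for illustration; neither figure is stated in the text, and the function name is hypothetical.

```python
import math


def clones_needed(insert_bp, genome_bp, probability):
    """Estimate the number of independent clones required so that any given
    genomic sequence is represented with the stated probability
    (Clarke and Carbon relation)."""
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1 (exclusive)")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


if __name__ == "__main__":
    # Assumed values: 4 kb inserts, 4.2 Mb genome, 99% coverage probability.
    print(clones_needed(4_000, 4_200_000, 0.99))
```

With these assumed values the estimate is on the order of several thousand clones per strain, which gives a sense of the screening scale implied by a 4 Kb insert library.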

In another aspect of the invention, GAT polynucleotides are produced by 
diversifying, e.g., recombining and/or mutating one or more naturally occurring, isolated, 
or recombinant GAT polynucleotides. As described in more detail elsewhere herein, it is 
often possible to generate diversified GAT polynucleotides encoding GAT polypeptides 
with superior functional attributes, e.g., increased catalytic function, increased stability, 
higher expression level, than a GAT polynucleotide used as a substrate or parent in the 
diversification process. 

The polynucleotides of the invention have a variety of uses in, for example: 
recombinant production (i.e., expression) of the GAT polypeptides of the invention; as 
transgenes (e.g., to confer herbicide resistance in transgenic plants); as selectable markers 
for transformation and plasmid maintenance; as immunogens; as diagnostic probes for the 
presence of complementary or partially complementary nucleic acids (including for
detection of natural GAT coding nucleic acids); as substrates for further diversity
generation, e.g., recombination reactions or mutation reactions to produce new and/or 
improved GAT homologues, and the like. 

It is important to note that certain specific, substantial and credible utilities
of GAT polynucleotides do not require that the polynucleotide encode a polypeptide with
substantial GAT activity. For example, GAT polynucleotides that do not encode active
enzymes can be valuable sources of parental polynucleotides for use in diversification
procedures to arrive at GAT polynucleotide variants, or non-GAT polynucleotides, with
desirable functional properties (e.g., high kcat or kcat/Km, low Km, high stability towards
heat or other environmental factors, high transcription or translation rates, resistance to
proteolytic cleavage, reduced antigenicity, etc.). For example, nucleotide sequences
encoding protease variants with little or no detectable activity have been used as parent 
polynucleotides in DNA shuffling experiments to produce progeny encoding highly active 
proteases (Ness et al. (1999) Nature Biotechnology 17:893-96). 

Polynucleotide sequences produced by diversity generation methods or 
recursive sequence recombination ("RSR") methods (e.g., DNA shuffling) are a feature of 
the invention. Mutation and recombination methods using the nucleic acids described 
herein are a feature of the invention. For example, one method of the invention includes 
recursively recombining one or more nucleotide sequences of the invention as described 
above and below with one or more additional nucleotides. The recombining steps are 
optionally performed in vivo, ex vivo, in silico or in vitro. Said diversity generation or 
recursive sequence recombination produces at least one library of recombinant modified 
GAT polynucleotides. Polypeptides encoded by members of this library are included in 
the invention. 

Also contemplated are uses of polynucleotides, also referred to herein as 
oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably 
at least 20, 30, or 50 or more bases, which hybridize under stringent or highly stringent 
conditions to a GAT polynucleotide sequence. The polynucleotides may be used as 
probes, primers, sense and antisense agents, and the like, according to methods as noted 
herein. 

In accordance with the present invention, GAT polynucleotides, including 
nucleotide sequences that encode GAT polypeptides, fragments of GAT polypeptides,
related fusion proteins, or functional equivalents thereof, are used in recombinant DNA 
molecules that direct the expression of the GAT polypeptides in appropriate host cells, 
such as bacterial or plant cells. Due to the inherent degeneracy of the genetic code, other 
nucleic acid sequences which encode substantially the same or a functionally equivalent 
amino acid sequence can also be used to clone and express the GAT polynucleotides. 

The invention provides GAT polynucleotides that encode transcription
and/or translation products that are subsequently spliced to ultimately produce functional
GAT polypeptides. Splicing can be accomplished in vitro or in vivo, and can involve cis
or trans splicing. The substrate for splicing can be polynucleotides (e.g., RNA transcripts)
or polypeptides. An example of cis splicing of a polynucleotide is where an intron
inserted into a coding sequence is removed and the two flanking exon regions are spliced
to generate a GAT polypeptide encoding sequence. An example of trans splicing would 
be where a GAT polynucleotide is encrypted by separating the coding sequence into two 
or more fragments that can be separately transcribed and then spliced to form the full- 
length GAT encoding sequence. The use of a splicing enhancer sequence (which can be 
introduced into a construct of the invention) can facilitate splicing either in cis or trans. 
Cis and trans splicing of polypeptides are described in more detail elsewhere herein.
More detailed description of cis and trans splicing can be found in US patent application 
Nos. 09/517,933 and 09/710,686. 

Thus, some GAT polynucleotides do not directly encode a full-length GAT 
polypeptide, but rather encode a fragment or fragments of a GAT polypeptide. These 
GAT polynucleotides can be used to express a functional GAT polypeptide through a 
mechanism involving splicing, where splicing can occur at the level of polynucleotide 
(e.g., intron/exon) and/or polypeptide (e.g., intein/extein). This can be useful, for 
example, in controlling expression of GAT activity, since functional GAT polypeptide will 
only be expressed if all required fragments are expressed in an environment that permits 
splicing processes to generate functional product. In another example, introduction of one 
or more insertion sequences into a GAT polynucleotide can facilitate recombination with a
low homology polynucleotide; use of an intron or intein for the insertion sequence 
facilitates the removal of the intervening sequence, thereby restoring function of the 
encoded variant. 

As will be understood by those of skill in the art, it can be advantageous to 
modify a coding sequence to enhance its expression in a particular host. The genetic code 
is redundant with 64 possible codons, but most organisms preferentially use a subset of 
these codons. The codons that are utilized most often in a species are called optimal 
codons, and those not utilized very often are classified as rare or low-usage codons (see,
e.g., Zhang SP et al. (1991) Gene 105:61-72). Codons can be substituted to reflect the
preferred codon usage of the host, a process sometimes called "codon optimization" or 
"controlling for species codon bias." 

Optimized coding sequences containing codons preferred by a particular
prokaryotic or eukaryotic host (see also, Murray, E. et al. (1989) Nuc. Acids Res. 17:477-
508) can be prepared, for example, to increase the rate of translation or to produce
recombinant RNA transcripts having desirable properties, such as a longer half-life, as
compared with transcripts produced from a non-optimized sequence. Translation stop
codons can also be modified to reflect host preference. For example, preferred stop
codons for S. cerevisiae and mammals are UAA and UGA respectively. The preferred
stop codon for monocotyledonous plants is UGA, whereas insects and E. coli prefer to use
UAA as the stop codon (Dalphin ME et al. (1996) Nuc. Acids Res. 24:216-218).
Methodology for optimizing a nucleotide sequence for expression in a plant is provided, 
for example, in U.S. Patent No. 6,015,891, and references cited therein. 

One embodiment of the invention includes a GAT polynucleotide having 
optimal codons for expression in a relevant host, e.g., a transgenic plant host. This is 
particularly desirable when a GAT polynucleotide of bacterial origin is introduced into a 
transgenic plant, e.g., to confer glyphosate resistance to the plant. 

The polynucleotide sequences of the present invention can be engineered in 
order to alter a GAT polynucleotide for a variety of reasons, including but not limited to, 
alterations which modify the cloning, processing and/or expression of the gene product. 
For example, alterations may be introduced using techniques that are well known in the 
art, e.g., site-directed mutagenesis, to insert new restriction sites, alter glycosylation 
patterns, change codon preference, introduce splice sites, etc. 

As described in more detail herein, the polynucleotides of the invention 
include sequences which encode novel GAT polypeptides and sequences complementary 
to the coding sequences, and novel fragments of coding sequence and complements
thereof. The polynucleotides can be in the form of RNA or in the form of DNA, and
include mRNA, cRNA, synthetic RNA and DNA, genomic DNA and cDNA. The 
polynucleotides can be double-stranded or single-stranded, and if single-stranded, can be 
the coding strand or the non-coding (anti-sense, complementary) strand. The 
polynucleotides optionally include the coding sequence of a GAT polypeptide (i) in 
isolation, (ii) in combination with additional coding sequence, so as to encode, e.g., a 
fusion protein, a pre-protein, a prepro-protein, or the like, (iii) in combination with non- 
coding sequences, such as introns or inteins, control elements such as a promoter, an 
enhancer, a terminator element, or 5' and/or 3' untranslated regions effective for expression
of the coding sequence in a suitable host, and/or (iv) in a vector or host environment in 
which the GAT polynucleotide is a heterologous gene. Sequences can also be found in 
combination with typical compositional formulations of nucleic acids, including in the
presence of carriers, buffers, adjuvants, excipients and the like.

Polynucleotides and oligonucleotides of the invention can be prepared by
standard solid-phase methods, according to known synthetic methods. Typically,
fragments of up to about 100 bases are individually synthesized, then joined (e.g., by
enzymatic or chemical ligation methods, or polymerase mediated methods) to form
essentially any desired continuous sequence. For example, polynucleotides and
oligonucleotides of the invention can be prepared by chemical synthesis using, e.g., the
classical phosphoramidite method described by Beaucage et al. (1981) Tetrahedron
Letters 22:1859-69, or the method described by Matthes et al. (1984) EMBO J. 3:801-05,
e.g., as is typically practiced in automated synthetic methods. According to the
phosphoramidite method, oligonucleotides are synthesized, e.g., in an automatic DNA
synthesizer, purified, annealed, ligated and cloned in appropriate vectors.

In addition, essentially any nucleic acid can be custom ordered from any of 
a variety of commercial sources, such as The Midland Certified Reagent Company 
(mcrc@oligos.com), The Great American Gene Company (http://www.genco.com), 
ExpressGen Inc. (www.expressgen.com), Operon Technologies Inc. (Alameda, CA) and
many others. Similarly, peptides and antibodies can be custom ordered from any of a 
variety of sources, such as PeptidoGenic (pkim@ccnet.com), HTI Bio-products, Inc. 
(http://www.htibio.com), BMA Biomedicals Ltd (U.K.), Bio.Synthesis, Inc., and many 
others. 

Polynucleotides may also be synthesized by well-known techniques as
described in the technical literature. See, e.g., Carruthers et al., Cold Spring Harbor
Symp. Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661
(1983). Double stranded DNA fragments may then be obtained either by synthesizing the
complementary strand and annealing the strands together under appropriate conditions, or
by adding the complementary strand using DNA polymerase with an appropriate primer

sufficient to direct persons of skill through in vitro amplification methods, including the
polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase
amplification and other RNA polymerase mediated techniques (e.g., NASBA) are found in
Berger, Sambrook, and Ausubel, as well as Mullis et al. (1987) U.S. Patent No.
4,683,202; PCR Protocols: A Guide to Methods and Applications (Innis et al., eds.)
Academic Press Inc. San Diego, CA (1990); Arnheim & Levinson (October 1, 1990)
Chemical and Engineering News 36-47; The Journal Of NIH Research (1991) 3:81-94;
Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173; Guatelli et al. (1990) Proc. Natl.
Acad. Sci. USA 87:1874; Lomeli et al. (1989) J. Clin. Chem. 35:1826; Landegren et al.
(1988) Science 241:1077-1080; Van Brunt (1990) Biotechnology 8:291-294; Wu and
Wallace (1989) Gene 4:560; Barringer et al. (1990) Gene 89:117; and Sooknanan and
Malek (1995) Biotechnology 13:563-564. Improved methods of cloning in vitro amplified
nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved methods
of amplifying large nucleic acids by PCR are summarized in Cheng et al. (1994) Nature
369:684-685 and the references therein, in which PCR amplicons of up to 40kb are
generated. One of skill will appreciate that essentially any RNA can be converted into a
double stranded DNA suitable for restriction digestion, PCR expansion and sequencing
using reverse transcriptase and a polymerase. See, Ausubel, Sambrook and Berger, all
supra.

Sequence Variations 

It will be appreciated by those skilled in the art that due to the degeneracy 
of the genetic code, a multitude of nucleotide sequences encoding GAT polypeptides of 
the invention may be produced, some of which bear substantial identity to the nucleic acid 
sequences explicitly disclosed herein.

Table 1
Codon Table

Amino acid                    Codons

Alanine          Ala   A      GCA GCC GCG GCU
Cysteine         Cys   C      UGC UGU
Aspartic acid    Asp   D      GAC GAU
Glutamic acid    Glu   E      GAA GAG
Phenylalanine    Phe   F      UUC UUU
Glycine          Gly   G      GGA GGC GGG GGU
Histidine        His   H      CAC CAU
Isoleucine       Ile   I      AUA AUC AUU
Lysine           Lys   K      AAA AAG
Leucine          Leu   L      UUA UUG CUA CUC CUG CUU
Methionine       Met   M      AUG
Asparagine       Asn   N      AAC AAU
Proline          Pro   P      CCA CCC CCG CCU
Glutamine        Gln   Q      CAA CAG
Arginine         Arg   R      AGA AGG CGA CGC CGG CGU
Serine           Ser   S      AGC AGU UCA UCC UCG UCU
Threonine        Thr   T      ACA ACC ACG ACU
Valine           Val   V      GUA GUC GUG GUU
Tryptophan       Trp   W      UGG
Tyrosine         Tyr   Y      UAC UAU

For instance, inspection of the codon table (Table 1) shows that
codons AGA, AGG, CGA, CGC, CGG, and CGU all encode the amino acid arginine.
Thus, at every position in the nucleic acids of the invention where an arginine is specified
by a codon, the codon can be altered to any of the corresponding codons described above
without altering the encoded polypeptide. It is understood that U in an RNA sequence
corresponds to T in a DNA sequence.

Using, as an example, the nucleic acid sequence corresponding to
nucleotides 1-15 of SEQ ID NO:1, ATG ATT GAA GTC AAA, a silent variation of this
sequence includes ATG ATC GAG GTG AAG, both sequences encoding the amino
acid sequence MIEVK, corresponding to amino acids 1-5 of SEQ ID NO:6.
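The five-codon example can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code of Table 1 written in DNA form (T in place of U); the function names translate and is_silent_variation are illustrative and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code in DNA form, bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (spaces allowed, U accepted as T)."""
    dna = dna.replace(" ", "").upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variation(parent: str, variant: str) -> bool:
    """True when both sequences encode the same amino acid sequence."""
    return translate(parent) == translate(variant)


parent = "ATG ATT GAA GTC AAA"
variant = "ATG ATC GAG GTG AAG"
print(translate(parent))                      # MIEVK
print(is_silent_variation(parent, variant))   # True
# Methionine has a single codon, so the first codon cannot be varied silently:
print(translate("AGT ATC GAG GTG AAG"))       # SIEVK
```

Any change to the first codon, for example to AGT, alters the encoded peptide, which is why a silent variation of this sequence must retain ATG at positions 1-3.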

Such "silent variations" are one species of "conservatively modified
variations", discussed below. One of skill will recognize that each codon in a nucleic acid
(except AUG, which is ordinarily the only codon for methionine) can be modified by
standard techniques to encode a functionally identical polypeptide. Accordingly, each
silent variation of a nucleic acid which encodes a polypeptide is implicit in any described
sequence. The invention provides each and every possible variation of nucleic acid
sequence encoding a polypeptide of the invention that could be made by selecting
combinations based on possible codon choices. These combinations are made in
accordance with the standard triplet genetic code (e.g., as set forth in Table 1) as applied to
the nucleic acid sequence encoding a GAT homologue polypeptide of the invention. All
such variations of every nucleic acid herein are specifically provided and described by
consideration of the sequence in combination with the genetic code. Any variant can be
produced as noted herein.
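To illustrate how many sequences are implied by combining codon choices, the following sketch counts and enumerates every coding sequence for a short peptide. It assumes the standard genetic code in DNA form; the names SYNONYMS, count_encodings and all_encodings are hypothetical helpers introduced only for this illustration.

```python
from collections import defaultdict
from itertools import product
from math import prod

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the list of codons that encode it.
SYNONYMS = defaultdict(list)
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    SYNONYMS[aa].append("".join(codon))


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


def all_encodings(peptide: str):
    """Yield every coding sequence for the peptide (practical for short peptides only)."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(codons)


print(len(SYNONYMS["R"]))          # 6 codons for arginine
print(count_encodings("MIEVK"))    # 1 * 3 * 2 * 4 * 2 = 48
```

The count grows multiplicatively with peptide length, so for a full-length polypeptide the variants are better described by the sequence together with the genetic code than by explicit listing.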

A group of two or more different codons that, when translated in the same
context, all encode the same amino acid, are referred to herein as "synonymous codons."
As described herein, in some aspects of the invention a GAT polynucleotide is engineered
for optimized codon usage in a desired host organism, for example a plant host. The terms
"optimized" and "optimal" are not meant to be restricted to the very best possible
combination of codons, but simply indicate that the coding sequence as a whole possesses
an improved usage of codons relative to a precursor polynucleotide from which it was
derived. Thus, in one aspect the invention provides a method for producing a GAT
polynucleotide variant by replacing at least one parental codon in a nucleotide sequence
with a synonymous codon that is preferentially used in a desired host organism, e.g., a
plant, relative to the parental codon.
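The codon replacement method can be sketched as follows. The host preference table HYPOTHETICAL_PREFERRED is invented purely for illustration and does not represent measured codon usage of any organism; the function name optimize_codons is likewise an assumption. The sketch only shows the mechanics of swapping a parental codon for a synonymous one while checking that the encoded amino acid is unchanged.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL preferences (amino acid -> preferred codon); not real usage data.
HYPOTHETICAL_PREFERRED = {"I": "ATC", "E": "GAG", "V": "GTG", "K": "AAG"}


def optimize_codons(dna: str, preferred: dict) -> str:
    """Replace each codon with the host-preferred synonymous codon, if one is listed."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)  # keep the parental codon if no preference
        if CODON_TABLE[choice] != aa:
            raise ValueError(f"{choice} is not synonymous with {codon}")
        out.append(choice)
    return "".join(out)


print(optimize_codons("ATG ATT GAA GTC AAA", HYPOTHETICAL_PREFERRED))
# ATGATCGAGGTGAAG
```

In practice the preference table would be derived from codon usage data for the intended host; the synonymy check guards against a table entry that would change the encoded polypeptide.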

"Conservatively modified variations" or, simply, "conservative variations" 

of a particular nucleic acid sequence refer to those nucleic acids which encode identical
or essentially identical amino acid sequences, or, where the nucleic acid does not encode
an amino acid sequence, to essentially identical sequences. One of skill will recognize
that individual substitutions, deletions or additions which alter, add or delete a single
amino acid or a small percentage of amino acids (typically less than 5%, more typically
less than 4%, 2% or 1%, or less) in an encoded sequence are "conservatively modified
variations" where the alterations result in the deletion of an amino acid, addition of an
amino acid, or substitution of an amino acid with a chemically similar amino acid.

Conservative substitution tables providing functionally similar amino acids
are well known in the art. Table 2 sets forth six groups which contain amino acids that are
"conservative substitutions" for one another.
Table 2

Conservative Substitution Groups

1    Alanine (A)          Serine (S)           Threonine (T)
2    Aspartic acid (D)    Glutamic acid (E)
3    Asparagine (N)       Glutamine (Q)
4    Arginine (R)         Lysine (K)
5    Isoleucine (I)       Leucine (L)          Methionine (M)       Valine (V)
6    Phenylalanine (F)    Tyrosine (Y)         Tryptophan (W)

Thus, "conservatively substituted variations" of a listed polypeptide
sequence of the present invention include substitutions of a small percentage, typically less
than 5%, more typically less than 2% and often less than 1%, of the amino acids of the
polypeptide sequence, with a conservatively selected amino acid of the same conservative
substitution group.

For example, a conservatively substituted variation of the polypeptide
identified herein as SEQ ID NO:6 will contain "conservative substitutions", according to
the six groups defined above, in up to 7 residues (i.e., 5% of the amino acids) in the 146
amino acid polypeptide.

In a further example, if four conservative substitutions were localized in
the region corresponding to amino acids 21 to 30 of SEQ ID NO:6, examples of
conservatively substituted variations of this region,
RPN QPL EAC M, include:
KPQ QPV ESC M and
KPN NPL DAC V and the like, in accordance with the conservative substitutions
listed in Table 2 (in the first variation the substitutions are R21K, N23Q, L26V and A28S;
in the second, R21K, Q24N, E27D and M30V). Listing
of a protein sequence herein, in conjunction with the above substitution table, provides an
express listing of all conservatively substituted proteins.
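The grouping in Table 2 lends itself to a simple check. The sketch below classifies each difference between a parent and a variant peptide as conservative (same Table 2 group) or not, and computes the residue limit implied by a percentage cap; the helper names classify_substitutions and max_substitutions are illustrative assumptions.

```python
# The six conservative substitution groups of Table 2 (one-letter codes).
GROUPS = [set("AST"), set("DE"), set("NQ"), set("RK"), set("ILMV"), set("FYW")]
GROUP_OF = {aa: idx for idx, group in enumerate(GROUPS) for aa in group}


def classify_substitutions(parent: str, variant: str, first_position: int = 1):
    """Return (conservative, other) lists of (position, parent_aa, variant_aa)."""
    parent, variant = parent.replace(" ", ""), variant.replace(" ", "")
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative, other = [], []
    for pos, (p, v) in enumerate(zip(parent, variant), start=first_position):
        if p == v:
            continue
        same_group = p in GROUP_OF and GROUP_OF.get(p) == GROUP_OF.get(v)
        (conservative if same_group else other).append((pos, p, v))
    return conservative, other


def max_substitutions(length: int, percent: int = 5) -> int:
    """Largest whole number of residues not exceeding the given percentage."""
    return length * percent // 100


cons, other = classify_substitutions("RPN QPL EAC M", "KPQ QPV ESC M", first_position=21)
print(cons)    # [(21, 'R', 'K'), (23, 'N', 'Q'), (26, 'L', 'V'), (28, 'A', 'S')]
print(other)   # []
print(max_substitutions(146))   # 7
```

Residues that fall outside the six groups (for example proline or cysteine) are never counted as conservative here, which matches a strict reading of Table 2.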

Finally, the addition of sequences which do not alter the encoded activity of
a nucleic acid molecule, such as the addition of a non-functional or non-coding sequence,
is a conservative variation of the basic nucleic acid.

WO 02/36782 PCT/US01/46227

One of skill will appreciate that many conservative variations of the nucleic
acid constructs which are disclosed yield a functionally identical construct. For example,
as discussed above, owing to the degeneracy of the genetic code, "silent substitutions"
(i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an
encoded polypeptide) are an implied feature of every nucleic acid sequence which encodes
an amino acid. Similarly, "conservative amino acid substitutions," in which one or a few
amino acids in an amino acid sequence are substituted with different amino acids with highly
similar properties, are also readily identified as being highly similar to a disclosed
construct. Such conservative variations of each disclosed sequence are a feature of the
present invention.

Non-conservative modifications of a particular nucleic acid are those which
substitute any amino acid not characterized as a conservative substitution, for example,
any substitution which crosses the bounds of the six groups set forth in Table 2. These
include substitutions of basic or acidic amino acids for neutral amino acids (e.g., Asp,
Glu, Asn, or Gln for Val, Ile, Leu or Met), aromatic amino acids for basic or acidic amino
acids (e.g., Phe, Tyr or Trp for Asp, Asn, Glu or Gln) or any other substitution not
replacing an amino acid with a like amino acid.

Nucleic Acid Hybridization 

Nucleic acids "hybridize" when they associate, typically in solution. 
Nucleic acids hybridize due to a variety of well-characterized physico-chemical forces, 
such as hydrogen bonding, solvent exclusion, base stacking and the like. An extensive 
guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory 
Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid 
Probes, part I, chapter 2, "Overview of principles of hybridization and the strategy of 
nucleic acid probe assays," (Elsevier, New York), as well as in Ausubel, supra. Hames
and Higgins (1995) Gene Probes 1, IRL Press at Oxford University Press, Oxford, 
England (Hames and Higgins 1) and Hames and Higgins (1995) Gene Probes 2, IRL Press 
at Oxford University Press, Oxford, England (Hames and Higgins 2) provide details on the 
synthesis, labeling, detection and quantification of DNA and RNA, including 
oligonucleotides. 

"Stringent hybridization wash conditions" in the context of nucleic acid 
hybridization experiments, such as Southern and northern hybridizations, are sequence 
dependent, and are different under different environmental parameters. An extensive 
guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames 
and Higgins 1 and Hames and Higgins 2, supra. 

For purposes of the present invention, generally, "highly stringent"
hybridization and wash conditions are selected to be about 5°C or less lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH (as
noted below, highly stringent conditions can also be referred to in comparative terms).
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the test
sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected
to be equal to the Tm for a particular probe.

The Tm of a nucleic acid duplex indicates the temperature at which the
duplex is 50% denatured under the given conditions, and it represents a direct measure of
the stability of the nucleic acid hybrid. Thus, the Tm corresponds to the temperature
at the midpoint in transition from helix to random coil; it depends on
length, nucleotide composition, and ionic strength for long stretches of nucleotides.

After hybridization, unhybridized nucleic acid material can be removed by 
a series of washes, the stringency of which can be adjusted depending upon the desired 
results. Low stringency washing conditions (e.g., using higher salt and lower temperature) 
increase sensitivity, but can produce nonspecific hybridization signals and high
background signals. Higher stringency conditions (e.g., using lower salt and higher
temperature that is closer to the hybridization temperature) lower the background signal,
typically with only the specific signal remaining. See Rapley, R. and Walker, J.M. eds., 
Molecular Biomethods Handbook (Humana Press, Inc. 1998) (hereinafter "Rapley and 
Walker"), which is incorporated herein by reference in its entirety for all purposes. 

The Tm of a DNA-DNA duplex can be estimated using Equation 1 as
follows:

Tm (°C) = 81.5°C + 16.6 (log10 M) + 0.41 (%G + C) - 0.72 (%f) - 500/n,

where M is the molarity of the monovalent cations (usually Na+), (%G +
C) is the percentage of guanosine (G) and cytosine (C) nucleotides, (%f) is the percentage
of formamide and n is the number of nucleotide bases (i.e., length) of the hybrid. See
Rapley and Walker, supra.
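As a worked illustration of Equation 1, the following sketch evaluates the estimate with the coefficients read from the text (81.5, 16.6, 0.41, 0.72 and 500). The function name tm_dna_dna and its parameter names are assumptions made for this example, and the input values are arbitrary.

```python
import math


def tm_dna_dna(molarity: float, gc_percent: float,
               formamide_percent: float, length: int) -> float:
    """Estimate Tm (degrees C) of a DNA-DNA duplex per Equation 1.

    molarity          -- monovalent cation concentration in mol/L (usually Na+)
    gc_percent        -- percentage of G + C nucleotides (0-100)
    formamide_percent -- percentage of formamide (0-100)
    length            -- number of nucleotide bases in the hybrid
    """
    if molarity <= 0 or length <= 0:
        raise ValueError("molarity and length must be positive")
    return (81.5
            + 16.6 * math.log10(molarity)
            + 0.41 * gc_percent
            - 0.72 * formamide_percent
            - 500.0 / length)


# 500-base hybrid, 50% G+C, 1 M Na+, no formamide:
print(round(tm_dna_dna(1.0, 50, 0, 500), 1))    # 101.0
# Same hybrid in 50% formamide:
print(round(tm_dna_dna(1.0, 50, 50, 500), 1))   # 65.0
```

The second call shows the practical effect of formamide: each percent lowers the estimated Tm by 0.72°C, which is why 50% formamide permits hybridization at temperatures such as 42°C.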

The Tm of an RNA-DNA duplex can be estimated by using Equation 2 as
follows:

Tm (°C) = 79.8°C + 18.5 (log10 M) + 0.58 (%G + C) - 11.8 (%G + C)^2 - 0.56 (%f) - 820/n,

where M is the molarity of the monovalent cations (usually Na+), (%G +
C) is the percentage of guanosine (G) and cytosine (C) nucleotides, (%f) is the percentage
of formamide and n is the number of nucleotide bases (i.e., length) of the hybrid. Id.

Equations 1 and 2 are typically accurate only for hybrid duplexes longer
than about 100-200 nucleotides. Id.

The Tm of nucleic acid sequences shorter than 50 nucleotides can be
calculated as follows:

Tm (°C) = 4(G + C) + 2(A + T),

where A (adenine), C, T (thymine), and G are the numbers of the
corresponding nucleotides.
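The short-sequence formula can be applied directly by counting bases. This is a minimal sketch; the function name tm_short_oligo is an assumption, and the 50-nucleotide limit is enforced only because the text restricts the formula to shorter sequences.

```python
def tm_short_oligo(seq: str) -> int:
    """Tm (degrees C) = 4(G + C) + 2(A + T) for sequences under 50 nucleotides."""
    seq = seq.replace(" ", "").upper().replace("U", "T")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T (or U)")
    if len(seq) >= 50:
        raise ValueError("formula applies only to sequences shorter than 50 nucleotides")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


# 15-mer with 4 G/C and 11 A/T bases: 4*4 + 2*11
print(tm_short_oligo("ATG ATT GAA GTC AAA"))   # 38
```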

An example of stringent hybridization conditions for hybridization of
complementary nucleic acids which have more than 100 complementary residues on a
filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42°C, with
the hybridization being carried out overnight. An example of stringent wash conditions is
a 0.2x SSC wash at 65°C for 15 minutes (see Sambrook, supra for a description of SSC
buffer). Often the high stringency wash is preceded by a low stringency wash to remove
background probe signal. An example low stringency wash is 2x SSC at 40°C for 15
minutes.

In general, a signal to noise ratio 2.5x-5x (or more) higher than that observed
for an unrelated probe in the particular hybridization assay indicates detection of a specific
hybridization. Detection of at least stringent hybridization between two sequences in the
context of the present invention indicates relatively strong structural similarity or
homology to, e.g., the nucleic acids of the present invention provided in the sequence
listings herein.

As noted, "highly stringent" conditions are selected to be about 5°C or less
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. Target sequences that are closely related or identical to the nucleotide
sequence of interest (e.g., "probe") can be identified under highly stringent conditions.
Lower stringency conditions are appropriate for sequences that are less complementary.
See, e.g., Rapley and Walker, supra.

Comparative hybridization can be used to identify nucleic acids of the
invention, and this comparative hybridization method is a preferred method of
distinguishing nucleic acids of the invention. Detection of highly stringent hybridization
between two nucleotide sequences in the context of the present invention indicates
relatively strong structural similarity/homology to, e.g., the nucleic acids provided in the
sequence listing herein. Highly stringent hybridization between two nucleotide sequences
demonstrates a degree of similarity or homology of structure, nucleotide base composition,
arrangement or order that is greater than that detected by stringent hybridization
conditions. In particular, detection of highly stringent hybridization in the context of the
present invention indicates strong structural similarity or structural homology (e.g.,
nucleotide structure, base composition, arrangement or order) to, e.g., the nucleic acids
provided in the sequence listings herein. For example, it is desirable to identify test
nucleic acids that hybridize to the exemplar nucleic acids herein under stringent
conditions.

Thus, one measure of stringent hybridization is the ability to hybridize to
one of the listed nucleic acids (e.g., nucleic acid sequences SEQ ID NO:1 to SEQ ID NO:5
and SEQ ID NO:11 to SEQ ID NO:262, and complementary polynucleotide sequences
thereof), under highly stringent conditions (or very stringent conditions, or ultra-high
stringency hybridization conditions, or ultra-ultra high stringency hybridization
conditions). Stringent hybridization (as well as highly stringent, ultra-high stringency, or
ultra-ultra high stringency hybridization conditions) and wash conditions can easily be
determined empirically for any test nucleic acid. For example, in determining highly
stringent hybridization and wash conditions, the stringency of the hybridization and wash
conditions is gradually increased (e.g., by increasing temperature, decreasing salt concentration,
increasing detergent concentration and/or increasing the concentration of organic solvents,
such as formamide, in the hybridization or wash), until a selected set of criteria is met. For
example, the stringency of the hybridization and wash conditions is gradually increased until a probe
comprising one or more nucleic acid sequences selected from SEQ ID NO:1 to SEQ ID
NO:5 and SEQ ID NO:11 to SEQ ID NO:262, and complementary polynucleotide
sequences thereof, binds to a perfectly matched complementary target (again, a nucleic
acid comprising one or more nucleic acid sequences selected from SEQ ID NO:1 to SEQ
ID NO:5 and SEQ ID NO:11 to SEQ ID NO:262, and complementary polynucleotide
sequences thereof), with a signal to noise ratio that is at least about 2.5x, and optionally
about 5x or more, as high as that observed for hybridization of the probe to an unmatched
target. In this case, the unmatched target is a nucleic acid corresponding to a nucleic acid
(other than those in the accompanying sequence listing) that is present in a public database
such as GenBank™ at the time of filing of the subject application. Such sequences can be
identified in GenBank by one of skill. Examples include Accession Nos. Z99109 and
Y09476. Additional such sequences can be identified in, e.g., GenBank, by one of
ordinary skill in the art.

A test nucleic acid is said to specifically hybridize to a probe nucleic acid
when it hybridizes at least ½ as well to the probe as to the perfectly matched
complementary target, i.e., with a signal to noise ratio at least ½ as high as hybridization
of the probe to the target under conditions in which the perfectly matched probe binds to
the perfectly matched complementary target with a signal to noise ratio that is at least
about 2x-10x, and occasionally 20x, 50x or more, as high as that observed for hybridization to
any of the unmatched polynucleotides Accession Nos. Z99109 and Y09476.

Ultra high-stringency hybridization and wash conditions are those in which
the stringency of hybridization and wash conditions are increased until the signal to noise
ratio for binding of the probe to the perfectly matched complementary target nucleic acid
is at least 10x as high as that observed for hybridization to any of the unmatched target
nucleic acids Genbank Accession numbers Z99109 and Y09476. A target nucleic acid
which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½
that of the perfectly matched complementary target nucleic acid is said to bind to the probe
under ultra-high stringency conditions.

Similarly, even higher levels of stringency can be determined by gradually
increasing the hybridization and/or wash conditions of the relevant hybridization assay.
For example, those in which the stringency of hybridization and wash conditions are
increased until the signal to noise ratio for binding of the probe to the perfectly matched
complementary target nucleic acid is at least 10x, 20x, 50x, 100x, or 500x or more as
high as that observed for hybridization to any of the unmatched target nucleic acids
Genbank Accession numbers Z99109 and Y09476. A target nucleic acid which hybridizes
to a probe under such conditions, with a signal to noise ratio of at least ½ that of the
perfectly matched complementary target nucleic acid is said to bind to the probe under
ultra-ultra-high stringency conditions.
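
The signal to noise criteria above can be illustrated with a short sketch. It assumes that the three signal to noise values have already been measured in one assay, and it takes the required fraction of the matched signal to be one half. The function names and the example numbers are hypothetical and are not part of the disclosure.

```python
def stringency_fold(matched_sn: float, unmatched_sn: float) -> float:
    """Fold by which the perfectly matched signal to noise exceeds the unmatched control."""
    if unmatched_sn <= 0:
        raise ValueError("unmatched signal to noise must be positive")
    return matched_sn / unmatched_sn


def binds_under(matched_sn: float, unmatched_sn: float, test_sn: float,
                min_fold: float, min_fraction: float = 0.5) -> bool:
    """Return True when both conditions described in the text hold.

    1. The assay conditions are stringent enough: the perfectly matched target
       gives a signal to noise ratio at least `min_fold` times the unmatched control.
    2. The test nucleic acid reaches at least `min_fraction` of the matched signal.
    """
    conditions_met = stringency_fold(matched_sn, unmatched_sn) >= min_fold
    test_binds = test_sn >= min_fraction * matched_sn
    return conditions_met and test_binds


# Hypothetical readings: matched target 100, unmatched control 8, test nucleic acid 60.
ULTRA_HIGH_FOLD = 10.0  # the "at least 10x" figure used for ultra-high stringency
print(binds_under(100.0, 8.0, 60.0, ULTRA_HIGH_FOLD))
```

Raising `min_fold` to 20, 50, 100 or 500 models the progressively higher stringency levels; the same test reading can pass at one level and fail at the next.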

Target nucleic acids which hybridize to the nucleic acids represented by
SEQ ID NO:1 to SEQ ID NO:5 and SEQ ID NO:11 to SEQ ID NO:262 under high,
ultra-high and ultra-ultra-high stringency conditions are a feature of the invention. Examples of
such nucleic acids include those with one or a few silent or conservative nucleic acid
substitutions as compared to a given nucleic acid sequence.

Nucleic acids which do not hybridize to each other under stringent
conditions are still substantially identical if the polypeptides which they encode are
substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the
maximum codon degeneracy permitted by the genetic code, or when antisera or antiserum
generated against one or more of SEQ ID NO:6 to SEQ ID NO:10 and SEQ ID NO:263 to
SEQ ID NO:514, which has been subtracted using the polypeptides encoded by known
nucleotide sequences, including Genbank Accession number CAA70664. Further details
on immunological identification of polypeptides of the invention are found below.
Additionally, for distinguishing between duplexes with sequences of less than about 100
nucleotides, a TMACl hybridization procedure known to those of ordinary skill in the art
can be used. See, e.g., Sorg, U. et al. 1 Nucleic Acids Res. (Sept. 11, 1991) 19(17),
incorporated herein by reference in its entirety for all purposes.

In one aspect, the invention provides a nucleic acid which comprises a
unique subsequence in a nucleic acid selected from SEQ ID NO:1 to SEQ ID NO:5 and
SEQ ID NO:11 to SEQ ID NO:262. The unique subsequence is unique as compared to a
nucleic acid corresponding to any of Genbank Accession numbers Z99109 and Y09476.
Such unique subsequences can be determined by aligning any of SEQ ID NO:1 to SEQ ID
NO:5 and SEQ ID NO:11 to SEQ ID NO:262 against the complete set of nucleic acids
represented by GenBank accession numbers Z99109, Y09476 or other related sequences
available in public databases as of the filing date of the subject application. Alignment
can be performed using the BLAST algorithm set to default parameters. Any unique
subsequence is useful, e.g., as a probe to identify the nucleic acids of the invention.
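
The idea of a unique subsequence can be sketched as follows. Note that this sketch swaps in exact k-mer matching for the BLAST alignment named above, so it only finds windows with no identical match in the controls and does not score near matches. The function names, the window length k, and the toy sequences are assumptions for illustration; they are not taken from the sequence listing or from the named accessions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set[str]:
    """All windows of length k that occur in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_subsequences(query: str, controls: list[str], k: int) -> list[tuple[int, str]]:
    """Return (position, window) pairs from query that occur in no control.

    Both strands of every control are considered, because a probe could
    hybridize to either strand.
    """
    seen: set[str] = set()
    for control in controls:
        seen |= kmers(control, k)
        seen |= kmers(reverse_complement(control), k)
    query = query.upper()
    return [(i, query[i:i + k])
            for i in range(len(query) - k + 1)
            if query[i:i + k] not in seen]


# Toy example: the query shares its first five bases with the control.
print(unique_subsequences("ATGCCGTA", ["ATGCC"], k=4))
```

In practice k would be chosen close to the intended probe length, and an alignment-based search would still be needed to exclude near-identical matches.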

Similarly, the invention includes a polypeptide which comprises a unique
subsequence in a polypeptide selected from: SEQ ID NO:6 to SEQ ID NO:10 and SEQ ID
NO:263 to SEQ ID NO:514. Here, the unique subsequence is unique as compared to a
polypeptide corresponding to GenBank accession number CAA70664. Here again, the
polypeptide is aligned against the sequences represented by accession number CAA70664.
Note that if the sequence corresponds to a non-translated sequence such as a pseudogene,
the corresponding polypeptide is generated simply by in silico translation of the nucleic
acid sequence into an amino acid sequence, where the reading frame is selected to
correspond to the reading frame of homologous GAT polynucleotides.
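
In silico translation in a chosen reading frame can be sketched as below. The sketch assumes the standard genetic code and a caller-supplied frame offset of 0, 1 or 2; the function name and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order for each
# codon position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(nucleic_acid: str, frame: int = 0) -> str:
    """Translate a DNA or RNA string starting at offset `frame` (0, 1 or 2).

    Trailing bases that do not fill a codon are ignored; codons containing
    characters other than A, C, G, T/U are reported as "X".
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = nucleic_acid.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# The same coding region read in frame 0, and again with one leading base in frame 1.
print(translate("ATGGCCTAA"))
print(translate("CATGGCCTAA", frame=1))
```

Selecting the frame by comparison with homologous GAT polynucleotides, as the text describes, would be done before calling such a function.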

The invention also provides for target nucleic acids which hybridize under
stringent conditions to a unique coding oligonucleotide which encodes a unique
subsequence in a polypeptide selected from SEQ ID NO:6 to SEQ ID NO:10 and SEQ ID
NO:263 to SEQ ID NO:514, wherein the unique subsequence is unique as compared to a
polypeptide corresponding to any of the control polypeptides. Unique sequences are
determined as noted above.

In one example, the stringent conditions are selected such that a perfectly
complementary oligonucleotide to the coding oligonucleotide hybridizes to the coding
oligonucleotide with at least about a 2.5x-10x higher, preferably at least about a 5-10x
higher signal to noise ratio than for hybridization of the perfectly complementary
oligonucleotide to a control nucleic acid corresponding to any of the control polypeptides.
Conditions can be selected such that higher ratios of signal to noise are observed in the
particular assay which is used, e.g., about 15x, 20x, 30x, 50x or more. In this example, the
target nucleic acid hybridizes to the unique coding oligonucleotide with at least a 2x
higher signal to noise ratio as compared to hybridization of the control nucleic acid to the
coding oligonucleotide. Again, higher signal to noise ratios can be selected, e.g., about
2.5x, 5x, 10x, 20x, 30x, 50x or more. The particular signal will depend on the label used
in the relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or
the like.

Vectors, Promoters and Expression Systems

The present invention also includes recombinant constructs comprising one 
or more of the nucleic acid sequences as broadly described above. The constructs 
comprise a vector, such as, a plasmid, a cosmid, a phage, a virus, a bacterial artificial 
chromosome (BAC), a yeast artificial chromosome (YAC), or the like, into which a 
nucleic acid sequence of the invention has been inserted, in a forward or reverse 
orientation. In a preferred aspect of this embodiment, the construct further comprises 
regulatory sequences, including, for example, a promoter, operably linked to the sequence. 
Large numbers of suitable vectors and promoters are known to those of skill in the art, and 
are commercially available. 

General texts which describe molecular biological techniques useful herein,
including the use of vectors, promoters and many other relevant topics, include Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152
Academic Press, Inc., San Diego, CA (Berger); Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989 ("Sambrook") and Current Protocols in Molecular Biology, F.M.
Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 1999) ("Ausubel").
Examples of protocols sufficient to direct persons of skill through in vitro amplification
methods, including the polymerase chain reaction (PCR), the ligase chain reaction (LCR),
Qβ-replicase amplification and other RNA polymerase mediated techniques (e.g.,
NASBA), e.g., for the production of the homologous nucleic acids of the invention are
found in Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) U.S. Patent No.
4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al. eds)
Academic Press Inc. San Diego, CA (1990) (Innis); Arnheim & Levinson (October 1,
1990) C&EN 36-47; The Journal Of NIH Research (1991) 3, 81-94; Kwoh et al. (1989)
Proc. Natl. Acad. Sci. USA 86, 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87,
1874; Lomell et al. (1989) J. Clin. Chem 35, 1826; Landegren et al., (1988) Science 241,
1077-1080; Van Brunt (1990) Biotechnology 8, 291-294; Wu and Wallace, (1989) Gene 4,
560; Barringer et al. (1990) Gene 89, 117, and Sooknanan and Malek (1995)
Biotechnology 13: 563-564. Improved methods for cloning in vitro amplified nucleic
acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved methods for
amplifying large nucleic acids by PCR are summarized in Cheng et al. (1994) Nature 369:
684-685 and the references cited therein, in which PCR amplicons of up to 40kb are
generated. One of skill will appreciate that essentially any RNA can be converted into a
double stranded DNA suitable for restriction digestion, PCR expansion and sequencing
using reverse transcriptase and a polymerase. See, e.g., Ausubel, Sambrook and Berger,
all supra.

The present invention also relates to engineered host cells that are 
transduced (transformed or transfected) with a vector of the invention (e.g., an invention 
cloning vector or an invention expression vector), as well as the production of 
polypeptides of the invention by recombinant techniques. The vector may be, for 
example, a plasmid, a viral particle, a phage, etc. The engineered host cells can be 
cultured in conventional nutrient media modified as appropriate for activating promoters, 
selecting transformants, or amplifying the GAT homologue gene. Culture conditions, 
such as temperature, pH and the like, are those previously used with the host cell selected 
for expression, and will be apparent to those skilled in the art and in the references cited 
herein, including, e.g., Sambrook, Ausubel and Berger, as well as e.g., Freshney (1994) 
Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New
York and the references cited therein.

GAT polypeptides of the invention can be produced in non-animal cells
such as plants, yeast, fungi, bacteria and the like. In addition to Sambrook, Berger and
Ausubel, details regarding non-animal cell culture can be found in Payne et al. (1992)
Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, NY;
Gamborg and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture: Fundamental
Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) and Atlas
and Parks (eds) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton,
FL.

Polynucleotides of the present invention can be incorporated into any one
of a variety of expression vectors suitable for expressing a polypeptide. Suitable vectors
include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of
SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from
combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl
pox virus, pseudorabies, adeno-associated virus, retroviruses and many others.
Any vector that transduces genetic material into a cell, and, if replication is desired, which
is replicable and viable in the relevant host can be used.

When incorporated into an expression vector, a polynucleotide of the 
invention is operatively linked to an appropriate transcription control sequence (promoter) 
to direct mRNA synthesis. Examples of such transcription control sequences particularly 
suited for use in transgenic plants include the cauliflower mosaic virus (CaMV), figwort 
mosaic virus (FMV) and strawberry vein banding virus (SVBV) promoters, described in 
U.S. Provisional Application No. 60/245,354. Other promoters known to control 
expression of genes in prokaryotic or eukaryotic cells or their viruses and which can be 
used in some embodiments of the invention include the SV40 promoter, the E. coli lac or trp
promoter, and the phage lambda PL promoter. An expression vector optionally contains a
ribosome binding site for translation initiation, and a transcription terminator. The vector 
also optionally includes appropriate sequences for amplifying expression, e.g., an 
enhancer. In addition, the expression vectors of the present invention optionally contain 
one or more selectable marker genes to provide a phenotypic trait for selection of 
transformed host cells, such as dihydrofolate reductase or neomycin resistance for 
eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli. 

Vectors of the present invention can be employed to transform an
appropriate host to permit the host to express an invention protein or polypeptide.
Examples of appropriate expression hosts include: bacterial cells, such as E. coli, B.
subtilis, Streptomyces, and Salmonella typhimurium; fungal cells, such as Saccharomyces
cerevisiae, Pichia pastoris, and Neurospora crassa; insect cells such as Drosophila and
Spodoptera frugiperda; mammalian cells such as CHO, COS, BHK, HEK 293 or Bowes
melanoma; or plant cells or explants, etc. It is understood that not all cells or cell lines
need to be capable of producing fully functional GAT polypeptides; for example, antigenic
fragments of a GAT polypeptide may be produced. The invention is not limited by the
host cells employed.

In bacterial systems, a number of expression vectors may be selected
depending upon the use intended for the GAT polypeptide. For example, when large
quantities of GAT polypeptide or fragments thereof are needed for commercial production
or for induction of antibodies, vectors which direct high level expression of fusion proteins
that are readily purified can be desirable. Such vectors include, but are not limited to,
multifunctional E. coli cloning and expression vectors such as BLUESCRIPT
(Stratagene), in which the GAT polypeptide coding sequence may be ligated into the
vector in-frame with sequences for the amino-terminal Met and the subsequent 7 residues
of beta-galactosidase so that a hybrid protein is produced; pIN vectors (Van Heeke &
Schuster (1989) J Biol Chem 264:5503-5509); pET vectors (Novagen, Madison WI); and
the like.

Similarly, in the yeast Saccharomyces cerevisiae a number of vectors
containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and
PGH may be used for production of the GAT polypeptides of the invention. For reviews,
see Ausubel et al. (supra) and Grant et al. (1987; Methods in Enzymology 153:516-544).

In mammalian host cells, a variety of expression systems, including viral- 
based systems, may be utilized. In cases where an adenovirus is used as an expression 
vector, a coding sequence, e.g., of a GAT polypeptide, is optionally ligated into an 
adenovirus transcription/translation complex consisting of the late promoter and tripartite 
leader sequence. Insertion of a GAT polypeptide coding region into a nonessential E1 or
E3 region of the viral genome will result in a viable virus capable of expressing a GAT in
infected host cells (Logan and Shenk (1984) Proc Natl Acad Sci USA 81:3655-3659). In
addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be
used to increase expression in mammalian host cells. 

Similarly, in plant cells, expression can be driven from a transgene 
integrated into a plant chromosome, or cytoplasmically from an episomal or viral nucleic 
acid. In the case of stably integrated transgenes, it is often desirable to provide sequences 
capable of driving constitutive or inducible expression of the GAT polynucleotides of the 
invention, for example, using viral, e.g., CaMV, or plant derived regulatory sequences. 
Numerous plant derived regulatory sequences have been described, including sequences 
which direct expression in a tissue specific manner, e.g., TobRB7, patatin B33, GRP gene
promoters, the rbcS-3A promoter, and the like. Alternatively, high level expression can be
achieved by transiently expressing exogenous sequences of a plant viral vector, e.g., TMV,
BMV, etc. Typically, transgenic plants constitutively expressing a GAT polynucleotide of
the invention will be preferred, and the regulatory sequences selected to ensure constitutive
stable expression of the GAT polypeptide.

In some embodiments of the present invention, a GAT polynucleotide 
construct suitable for transformation of plant cells is prepared. For example, a desired 
GAT polynucleotide can be incorporated into a recombinant expression cassette to 
facilitate introduction of the gene into a plant and subsequent expression of the encoded 
polypeptide. An expression cassette will typically comprise a GAT polynucleotide, or 
functional fragment thereof, operably linked to a promoter sequence and other 
transcriptional and translational initiation regulatory sequences which will direct 
expression of the sequence in the intended tissues (e.g., entire plant, leaves, seeds) of the 
transformed plant. 

For example, a strongly or weakly constitutive plant promoter can be
employed which will direct expression of the GAT polypeptide in all tissues of a plant. Such
promoters are active under most environmental conditions and states of development or
cell differentiation. Examples of constitutive promoters include the 1'- or 2'- promoter
derived from T-DNA of Agrobacterium tumefaciens, and other transcription initiation
regions from various plant genes known to those of skill. In situations in which
overexpression of a GAT polynucleotide is detrimental to the plant or otherwise
undesirable, one of skill, upon review of this disclosure, will recognize that weak
constitutive promoters can be used for low levels of expression. In those cases where high
levels of expression are not harmful to the plant, a strong promoter, e.g., a tRNA or other
pol III promoter, or a strong pol II promoter, such as the cauliflower mosaic virus
promoter, can be used.

Alternatively, a plant promoter may be under environmental control. Such 
promoters are referred to here as "inducible" promoters. Examples of environmental 
conditions that may affect transcription by inducible promoters include pathogen attack,
anaerobic conditions, or the presence of light. 

The promoters used in the present invention can be "tissue-specific" and, as
such, under developmental control in that the polynucleotide is expressed only in certain
tissues, such as leaves and seeds. In embodiments in which one or more nucleic acid
sequences endogenous to the plant system are incorporated into the construct, the
endogenous promoters (or variants thereof) from these genes can be employed for
directing expression of the genes in the transfected plant. Tissue-specific promoters can
also be used to direct expression of heterologous polynucleotides.

In general, the particular promoter used in the expression cassette in plants
depends on the intended application. Any of a number of promoters which direct
transcription in plant cells are suitable. The promoter can be either constitutive or
inducible. In addition to the promoters noted above, promoters of bacterial origin which
operate in plants include the octopine synthase promoter, the nopaline synthase promoter
and other promoters derived from native Ti plasmids (see, Herrera-Estrella et al. (1983)
Nature 303:209-213). Viral promoters include the 35S and 19S RNA promoters of
cauliflower mosaic virus (Odell et al. (1985) Nature 313:810-812). Other plant promoters
include the ribulose-1,5-bisphosphate carboxylase small subunit promoter and the
phaseolin promoter. The promoter sequence from the E8 gene and other genes may also
be used. The isolation and sequence of the E8 promoter is described in detail in Deikman
and Fischer (1988) EMBO J. 7:3315-3327.

To identify candidate promoters, the 5' portions of a genomic clone are
analyzed for sequences characteristic of promoter sequences. For instance, promoter
sequence elements include the TATA box consensus sequence (TATAAT), which is
usually 20 to 30 base pairs upstream of the transcription start site. In plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a promoter
element with a series of adenines surrounding the trinucleotide G (or T) as described by
Messing et al. (1983) Genetic Engineering in Plants, Kosage, et al. (eds.), pp. 221-227.
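
The TATA box screen described above can be sketched as a simple motif scan. The sketch assumes that the transcription start site is already known or hypothesized as an index into the genomic sequence, and it measures distance from the first base of the motif to that index; the function name, that distance convention, and the toy sequence are assumptions for illustration.

```python
def find_tata_candidates(genomic: str, tss_index: int, motif: str = "TATAAT",
                         min_upstream: int = 20, max_upstream: int = 30) -> list[tuple[int, int]]:
    """Return (motif_start, distance_upstream) for each exact motif occurrence
    whose start lies min_upstream..max_upstream bases before tss_index."""
    seq = genomic.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        distance = tss_index - start
        if min_upstream <= distance <= max_upstream:
            hits.append((start, distance))
        start = seq.find(motif, start + 1)  # continue after this hit, allowing overlaps
    return hits


# Toy sequence: the motif starts at index 10 and the assumed start site is at index 35.
toy = "G" * 10 + "TATAAT" + "C" * 19 + "A" + "G" * 5
print(find_tata_candidates(toy, tss_index=35))
```

An exact-match scan is the simplest choice; a real promoter screen would normally tolerate mismatches to the consensus, for example with a position weight matrix.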

In preparing polynucleotide constructs, e.g., vectors, of the invention,
sequences other than the promoter and the cojoined polynucleotide can also be employed.
If normal polypeptide expression is desired, a polyadenylation region at the 3'-end of a
GAT-encoding region can be included. The polyadenylation region can be derived, for
example, from a variety of plant genes, or from T-DNA.

The construct can also include a marker gene which confers a selectable
phenotype on plant cells. For example, the marker may encode biocide tolerance,
particularly antibiotic tolerance, such as tolerance to kanamycin, G418, bleomycin,
hygromycin, or herbicide tolerance, such as tolerance to chlorsulfuron, or
phosphinothricin (the active ingredient in the herbicides bialaphos and Basta).

WO 02/36782 PCT/US01/46227

Specific initiation signals can aid in efficient translation of a GAT
polypeptide-encoding sequence of the present invention. These signals can include,
e.g., the ATG initiation codon and adjacent sequences. In cases where a GAT
polypeptide-encoding sequence, its initiation codon and upstream sequences are inserted
into an appropriate expression vector, no additional translational control signals may be
needed. However, in cases where only coding sequence (e.g., a mature protein coding
sequence), or a portion thereof, is inserted, exogenous translational control signals
including the initiation codon must be provided. Furthermore, the initiation codon must be
in the correct reading frame to ensure translation of the entire insert. Exogenous
translational elements and initiation codons can be of various origins, both natural and
synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers
appropriate to the cell system in use (Scharf D et al. (1994) Results Probl Cell Differ
20:125-62; Bittner et al. (1987) Methods in Enzymol 153:516-544).
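
The reading-frame requirement for an exogenous initiation codon can be checked with a few lines of code. This sketch assumes the construct is available as a DNA string and that the positions of the supplied ATG and of the first base of the inserted coding sequence are known; the function name and the toy constructs are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def initiation_codon_in_frame(construct: str, atg_pos: int, cds_start: int) -> bool:
    """Return True if the ATG at atg_pos reads into the insert at cds_start
    in the same frame, with no stop codon between them."""
    seq = construct.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    if cds_start < atg_pos or (cds_start - atg_pos) % 3 != 0:
        return False  # insert is out of frame relative to the initiation codon
    for i in range(atg_pos, cds_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False  # translation would terminate before reaching the insert
    return True


# Toy constructs: ATG, a linker, then the start of an insert.
print(initiation_codon_in_frame("ATG" + "GGATCC" + "GCTGCT", atg_pos=0, cds_start=9))
print(initiation_codon_in_frame("ATG" + "GGATC" + "GCTGCT", atg_pos=0, cds_start=8))
```

The first call has a six-base linker and is in frame; the second has a five-base linker, which shifts the insert out of frame.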

Secretion/Localization Sequences 

Polynucleotides of the invention can also be fused, for example, in-frame to 
nucleic acids encoding a secretion/localization sequence, to target polypeptide expression 
to a desired cellular compartment, membrane, or organelle of a mammalian cell, or to 
direct polypeptide secretion to the periplasmic space or into the cell culture media. Such 
sequences are known to those of skill, and include secretion leader peptides, organelle 
targeting sequences (e.g., nuclear localization sequences, ER retention signals, 
mitochondrial transit sequences, chloroplast transit sequences), membrane 
localization/anchor sequences (e.g., stop transfer sequences, GPI anchor sequences), and 
the like. 

In a preferred embodiment, a polynucleotide of the invention is fused in 
frame with an N-terminal chloroplast transit sequence (or chloroplast transit peptide 
sequence) derived from a gene encoding a polypeptide that is normally targeted to the 
chloroplast. Such sequences are typically rich in serine and threonine; are deficient in 
aspartate, glutamate, and tyrosine; and generally have a central domain rich in positively 
charged amino acids. 
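
The compositional features listed for chloroplast transit sequences can be computed directly from a candidate peptide. The sketch below reports fractions only and applies no cutoff, since the text gives none. Treating the middle third of the peptide as the "central domain", counting only lysine and arginine as positively charged, and the toy peptide itself are all assumptions for illustration; the toy peptide is not a real transit sequence.

```python
def fraction(peptide: str, residues: str) -> float:
    """Fraction of positions in peptide occupied by any of the given residues."""
    if not peptide:
        return 0.0
    return sum(peptide.count(r) for r in residues) / len(peptide)


def transit_peptide_profile(peptide: str) -> dict[str, float]:
    """Summarize the composition features named in the text (one-letter codes)."""
    pep = peptide.upper()
    n = len(pep)
    central = pep[n // 3: (2 * n) // 3]  # middle third, an arbitrary stand-in
    return {
        "ser_thr": fraction(pep, "ST"),                # expected to be high
        "asp_glu_tyr": fraction(pep, "DEY"),           # expected to be low
        "central_positive": fraction(central, "KR"),   # expected to be high
    }


# Invented toy peptide used only to exercise the function.
print(transit_peptide_profile("SSTRKRSTS"))
```

Such a profile is only a coarse screen; it does not predict whether a given sequence actually directs chloroplast import.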

Expression Hosts

In a further embodiment, the present invention relates to host cells 
containing the above-described constructs. The host cell can be a eukaryotic cell, such as 
a mammalian cell, a yeast cell, or a plant cell, or the host cell can be a prokaryotic cell, 
such as a bacterial cell. Introduction of the construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-Dextran mediated transfection, electroporation, or 
other common techniques (Davis, L., Dibner, M., and Battey, I. (1986) Basic Methods in
Molecular Biology).

A host cell strain is optionally chosen for its ability to modulate the
expression of the inserted sequences or to process the expressed protein in the desired
fashion. Such modifications of the protein include, but are not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation. Post-translational
processing that cleaves a "pre" or a "prepro" form of the protein may also be important for
correct insertion, folding and/or function. Different host cells such as E. coli, Bacillus sp.,
yeast or mammalian cells such as CHO, HeLa, BHK, MDCK, 293, WI38, etc. have
specific cellular machinery and characteristic mechanisms, e.g., for post-translational
activities and may be chosen to ensure the desired modification and processing of the
introduced, foreign protein.

For long-term, high-yield production of recombinant proteins, stable 
expression systems can be used. For example, plant cells, explants or tissues, e.g. shoots, 
leaf discs, which stably express a polypeptide of the invention are transduced using 
expression vectors which contain viral origins of replication or endogenous expression 
elements and a selectable marker gene. Following the introduction of the vector, cells 
may be allowed to grow for a period determined to be appropriate for the cell type, e.g., 1 
or more hours for bacterial cells, 1-4 days for plant cells, 2-4 weeks for some plant 
explants, in an enriched media before they are switched to selective media. The purpose 
of the selectable marker is to confer resistance to selection, and its presence allows growth 
and recovery of cells which successfully express the introduced sequences. For example, 
transgenic plants expressing the polypeptides of the invention can be selected directly for 
resistance to the herbicide, glyphosate. Resistant embryos derived from stably 
transformed explants can be proliferated, e.g., using tissue culture techniques appropriate 
to the cell type. 

Host cells transformed with a nucleotide sequence encoding a polypeptide 
of the invention are optionally cultured under conditions suitable for the expression and 
recovery of the encoded protein from cell culture. The protein or fragment thereof 
produced by a recombinant cell may be secreted, membrane-bound, or contained 
intracellularly, depending on the sequence and/or the vector used. As will be understood 
by those of skill in the art, expression vectors containing GAT polynucleotides of the 
invention can be designed with signal sequences which direct secretion of the mature 
polypeptides through a prokaryotic or eukaryotic cell membrane. 

Additional Polypeptide Sequences 

Polynucleotides of the present invention may also comprise a coding
sequence fused in-frame to a marker sequence that, e.g., facilitates purification of the
encoded polypeptide. Such purification facilitating domains include, but are not limited
to, metal chelating peptides such as histidine-tryptophan modules that allow purification
on immobilized metals, a sequence which binds glutathione (e.g., GST), a hemagglutinin
(HA) tag (corresponding to an epitope derived from the influenza hemagglutinin protein;
Wilson et al. (1984) Cell 37:767), maltose binding protein sequences, the FLAG epitope
utilized in the FLAG extension/affinity purification system (Immunex Corp, Seattle,
WA), and the like. The inclusion of a protease-cleavable polypeptide linker sequence
between the purification domain and the GAT homologue sequence is useful to facilitate
purification. One expression vector contemplated for use in the compositions and methods
described herein provides for expression of a fusion protein comprising a polypeptide of
the invention fused to a polyhistidine region separated by an enterokinase cleavage site.
The histidine residues facilitate purification on IMIAC (immobilized metal ion affinity
chromatography, as described in Porath et al. (1992) Protein Expression and Purification
3:263-281) while the enterokinase cleavage site provides a means for separating the GAT
homologue polypeptide from the fusion protein. pGEX vectors (Promega; Madison, WI)
may also be used to express foreign polypeptides as fusion proteins with glutathione S-
transferase (GST). In general, such fusion proteins are soluble and can easily be purified
from lysed cells by adsorption to ligand-agarose beads (e.g., glutathione-agarose in the
case of GST-fusions) followed by elution in the presence of free ligand.
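
The polyhistidine and enterokinase arrangement described above can be modeled at the amino acid level. The sketch assumes the canonical enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys, with cleavage after the lysine, and a six-histidine tag preceded by an initiator methionine. The function names and the placeholder polypeptide are hypothetical; the placeholder is not any sequence from the sequence listing.

```python
ENTEROKINASE_SITE = "DDDDK"  # canonical recognition sequence; cleavage follows the Lys


def build_fusion(polypeptide: str, his_count: int = 6) -> str:
    """N-terminal Met, polyhistidine tag, enterokinase site, then the polypeptide."""
    return "M" + "H" * his_count + ENTEROKINASE_SITE + polypeptide


def cleave_with_enterokinase(fusion: str) -> tuple[str, str]:
    """Split the fusion after the first enterokinase site.

    Returns (tag_fragment, released_polypeptide).
    """
    idx = fusion.find(ENTEROKINASE_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found")
    cut = idx + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]


placeholder = "MKTAYIAK"  # invented sequence standing in for the polypeptide of interest
fusion = build_fusion(placeholder)
tag, released = cleave_with_enterokinase(fusion)
print(fusion, tag, released)
```

Placing the site immediately before the polypeptide means that cleavage leaves no tag-derived residues on the released product, which is one reason this arrangement is commonly chosen.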

Polypeptide Production and Recovery 

Following transduction of a suitable host strain and growth of the host 
strain to an appropriate cell density, the selected promoter is induced by appropriate means 
(e.g., temperature shift or chemical induction) and cells are cultured for an additional 
period. Cells are typically harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract retained for further purification. Microbial cells 
employed in expression of proteins can be disrupted by any convenient method, including 
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents, or 
other methods, which are well known to those skilled in the art. 

As noted, many references are available for the culture and production of
many cells, including cells of bacterial, plant, animal (especially mammalian) and
archaebacterial origin. See e.g., Sambrook, Ausubel, and Berger (all supra), as well as
Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition,
Wiley-Liss, New York and the references cited therein; Doyle and Griffiths (1997)
Mammalian Cell Culture: Essential Techniques John Wiley and Sons, NY; Humason
(1979) Animal Tissue Techniques, fourth edition W.H. Freeman and Company; and
Ricciardelli, et al., (1989) In vitro Cell Dev. Biol. 25:1016-1024. For plant cell culture
and regeneration, Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems
John Wiley & Sons, Inc. New York, NY; Gamborg and Phillips (eds) (1995) Plant Cell,
Tissue and Organ Culture: Fundamental Methods Springer Lab Manual, Springer-Verlag
(Berlin Heidelberg New York); Jones, ed. (1984) Plant Gene Transfer and Expression
Protocols, Humana Press, Totowa, New Jersey and Plant Molecular Biology (1993)
R.R.D. Croy, Ed. Bios Scientific Publishers, Oxford, U.K. ISBN 0 12 198370 6. Cell
culture media in general are set forth in Atlas and Parks (eds) The Handbook of
Microbiological Media (1993) CRC Press, Boca Raton, FL. Additional information for
cell culture is found in available commercial literature such as the Life Science Research
Cell Culture Catalogue (1998) from Sigma-Aldrich, Inc (St Louis, MO)
("Sigma-LSRCCC") and, e.g., The Plant Culture Catalogue and supplement (1997) also from
Sigma-Aldrich, Inc (St Louis, MO) ("Sigma-PCCS"). Further details regarding plant cell
transformation and transgenic plant production are found below.

Polypeptides of the invention can be recovered and purified from
recombinant cell cultures by any of a number of methods well known in the art, including
ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography (e.g., using any of the tagging systems noted
herein), hydroxylapatite chromatography, and lectin chromatography. Protein refolding
steps can be used, as desired, in completing the configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed in the final
purification steps. In addition to the references noted supra, a variety of purification
methods are well known in the art, including, e.g., those set forth in Sandana (1997)
Bioseparation of Proteins, Academic Press, Inc.; Bollag et al. (1996) Protein Methods,
2nd Edition Wiley-Liss, NY; Walker (1996) The Protein Protocols Handbook Humana
Press, NJ; Harris and Angal (1990) Protein Purification Applications: A Practical
Approach IRL Press at Oxford, Oxford, England; Harris and Angal Protein Purification
Methods: A Practical Approach IRL Press at Oxford, Oxford, England; Scopes (1993)
Protein Purification: Principles and Practice 3rd Edition Springer Verlag, NY; Janson and
Ryden (1998) Protein Purification: Principles, High Resolution Methods and Applications,
Second Edition Wiley-VCH, NY; and Walker (1998) Protein Protocols on CD-ROM
Humana Press, NJ.

In some cases, it is desirable to produce the GAT polypeptide of the
invention on a large scale suitable for industrial and/or commercial applications. In such
cases bulk fermentation procedures are employed. Briefly, a GAT polynucleotide, e.g., a
polynucleotide comprising any one of SEQ ID NOS: 1-5 and 11-262, or other nucleic
acids encoding GAT polypeptides of the invention can be cloned into an expression
vector. For example, U.S. Patent No. 5,955,310 to Widner et al. "METHODS FOR
PRODUCING A POLYPEPTIDE IN A BACILLUS CELL," describes a vector with
tandem promoters, and stabilizing sequences operably linked to a polypeptide encoding
sequence. After inserting the polynucleotide of interest into a vector, the vector is
transformed into a bacterial host, e.g., Bacillus subtilis strain PL1801HE (amyE, apr, npr,
spoIIE::Tn917). The introduction of an expression vector into a Bacillus cell may,
for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen (1979)
Molecular General Genetics 168:111), by using competent cells (see, e.g., Young and
Spizizen (1961) Journal of Bacteriology 81:823, or Dubnau and Davidoff-Abelson (1971)
Journal of Molecular Biology 56:209), by electroporation (see, e.g., Shigekawa and Dower
(1988) Biotechniques 6:742), or by conjugation (see, e.g., Koehler and Thorne (1987)
Journal of Bacteriology 169:5271); see also Ausubel, Sambrook and Berger, all supra.

The transformed cells are cultivated in a nutrient medium suitable for
production of the polypeptide using methods that are known in the art. For example, the
cell may be cultivated by shake flask cultivation, small-scale or large-scale fermentation
(including continuous, batch, fed-batch, or solid state fermentations) in laboratory or
industrial fermentors performed in a suitable medium and under conditions allowing the
polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable
nutrient medium comprising carbon and nitrogen sources and inorganic salts, using
procedures known in the art. Suitable media are available from commercial suppliers or
may be prepared according to published compositions (e.g., in catalogues of the American
Type Culture Collection). The secreted polypeptide can be recovered directly from the
medium.

The resulting polypeptide may be isolated by methods known in the art. For
example, the polypeptide may be isolated from the nutrient medium by conventional
procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying,
evaporation, or precipitation. The isolated polypeptide may then be further purified by a
variety of procedures known in the art including, but not limited to, chromatography (e.g.,
ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion),
electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility
(e.g., ammonium sulfate precipitation), or extraction (see, e.g., Bollag et al. (1996) Protein
Methods, 2nd Edition Wiley-Liss, NY; and Walker (1996) The Protein Protocols Handbook
Humana Press, NJ).

Cell-free transcription/translation systems can also be employed to produce 
polypeptides using DNAs or RNAs of the present invention. Several such systems are 
commercially available. A general guide to in vitro transcription and translation protocols 
is found in Tymms (1995) In vitro Transcription and Translation Protocols: Methods in 
Molecular Biology Volume 37, Garland Publishing, NY. 

SUBSTRATES AND FORMATS FOR SEQUENCE RECOMBINATION 

The polynucleotides of the invention are optionally used as substrates for a 
variety of diversity generating procedures, e.g., mutation, recombination and recursive 
recombination reactions, in addition to their use in standard cloning methods as set forth 
in, e.g., Ausubel, Berger and Sambrook, i.e., to produce additional GAT polynucleotides 
and polypeptides with desired properties. A variety of diversity generating protocols are 
available and described in the art. The procedures can be used separately, and/or in 
combination to produce one or more variants of a polynucleotide or set of polynucleotides, 
as well as variants of encoded proteins. Individually and collectively, these procedures
provide robust, widely applicable ways of generating diversified polynucleotides and sets 
of polynucleotides (including, e.g., polynucleotide libraries) useful, e.g., for the 
engineering or rapid evolution of polynucleotides, proteins, pathways, cells and/or 
organisms with new and/or improved characteristics. The process of altering the sequence 
can result in, for example, single nucleotide substitutions, multiple nucleotide 
substitutions, and insertion or deletion of regions of the nucleic acid sequence. 

While distinctions and classifications are made in the course of the ensuing 
discussion for clarity, it will be appreciated that the techniques are often not mutually 
exclusive. Indeed, the various methods can be used singly or in combination, in parallel or 
in series, to access diverse sequence variants. 

The result of any of the diversity generating procedures described herein
can be the generation of one or more polynucleotides, which can be selected or screened
for polynucleotides that encode proteins with or which confer desirable properties.
Following diversification by one or more of the methods herein, or otherwise available to
one of skill, any polynucleotides that are produced can be selected for a desired activity or
property, e.g., altered Km for glyphosate, altered Km for acetyl CoA, use of alternative
cofactors (e.g., propionyl CoA), increased kcat, etc. This can include identifying any
activity that can be detected, for example, in an automated or automatable format, by any
of the assays in the art. For example, GAT homologs with increased specific activity can
be detected by assaying the conversion of glyphosate to N-acetylglyphosate, e.g., by mass
spectrometry. Alternatively, improved ability to confer resistance to glyphosate can be
assayed by growing bacteria transformed with a nucleic acid of the invention on agar
containing increasing concentrations of glyphosate or by spraying transgenic plants
incorporating a nucleic acid of the invention with glyphosate. A variety of related (or
even unrelated) properties can be evaluated, in serial or in parallel, at the discretion of the
practitioner. Additional details regarding recombination and selection for herbicide
tolerance can be found, e.g., in "DNA SHUFFLING TO PRODUCE HERBICIDE
RESISTANT CROPS" (USSN 09/373,333) filed August 12, 1999.
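
By way of illustration only, the comparison of variants on kinetic properties such as Km and kcat can be sketched as follows. The sketch assumes simple Michaelis-Menten kinetics; the clone names and kinetic values are hypothetical placeholders and are not data from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str    # hypothetical clone label
    kcat: float  # turnover number, min^-1 (made-up value)
    km: float    # Km for glyphosate, mM (made-up value)


def rate(v: Variant, substrate_mM: float) -> float:
    """Michaelis-Menten rate per unit enzyme: kcat * [S] / (Km + [S])."""
    return v.kcat * substrate_mM / (v.km + substrate_mM)


def efficiency(v: Variant) -> float:
    """Catalytic efficiency, kcat / Km."""
    return v.kcat / v.km


def rank_by_efficiency(variants):
    """Order variants from highest to lowest kcat / Km."""
    return sorted(variants, key=efficiency, reverse=True)


# Illustrative values only.
variants = [
    Variant("parent", kcat=5.0, km=2.0),
    Variant("clone_A", kcat=20.0, km=1.0),
    Variant("clone_B", kcat=10.0, km=5.0),
]
ranked = rank_by_efficiency(variants)
```

A lower Km or a higher kcat both raise the kcat/Km ratio, so a single ranking of this kind captures either type of improvement; other criteria (e.g., cofactor usage) would need their own scoring.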

Descriptions of a variety of diversity generating procedures, including
family shuffling and methods for generating modified nucleic acid sequences encoding
multiple enzymatic domains, are found in the following publications and the references cited
therein: Soong, N. et al. (2000) "Molecular breeding of viruses" Nat Genet 25(4):436-39;
Stemmer, et al. (1999) "Molecular breeding of viruses for targeting and other clinical
properties" Tumor Targeting 4:1-4; Ness et al. (1999) "DNA Shuffling of subgenomic
sequences of subtilisin" Nature Biotechnology 17:893-896; Chang et al. (1999) "Evolution
of a cytokine using DNA family shuffling" Nature Biotechnology 17:793-797; Minshull
and Stemmer (1999) "Protein evolution by molecular breeding" Current Opinion in
Chemical Biology 3:284-290; Christians et al. (1999) "Directed evolution of thymidine
kinase for AZT phosphorylation using DNA family shuffling" Nature Biotechnology
17:259-264; Crameri et al. (1998) "DNA shuffling of a family of genes from diverse
species accelerates directed evolution" Nature 391:288-291; Crameri et al. (1997)
"Molecular evolution of an arsenate detoxification pathway by DNA shuffling," Nature
Biotechnology 15:436-438; Zhang et al. (1997) "Directed evolution of an effective
fucosidase from a galactosidase by DNA shuffling and screening" Proc. Natl. Acad. Sci.
USA 94:4504-4509; Patten et al. (1997) "Applications of DNA Shuffling to
Pharmaceuticals and Vaccines" Current Opinion in Biotechnology 8:724-733; Crameri et
al. (1996) "Construction and evolution of antibody-phage libraries by DNA shuffling"
Nature Medicine 2:100-103; Crameri et al. (1996) "Improved green fluorescent protein by

molecular evolution using DNA shuffling" Nature Biotechnology 14:315-319; Gates et al.
(1996) "Affinity selective isolation of ligands from peptide libraries through display on a
lac repressor 'headpiece dimer'" Journal of Molecular Biology 255:373-386; Stemmer
(1996) "Sexual PCR and Assembly PCR" In: The Encyclopedia of Molecular Biology.
VCH Publishers, New York, pp.447-457; Crameri and Stemmer (1995) "Combinatorial
multiple cassette mutagenesis creates all the permutations of mutant and wildtype
cassettes" BioTechniques 18:194-195; Stemmer et al., (1995) "Single-step assembly of a
gene and entire plasmid from large numbers of oligodeoxy-ribonucleotides" Gene,
164:49-53; Stemmer (1995) "The Evolution of Molecular Computation" Science 270:
1510; Stemmer (1995) "Searching Sequence Space" Bio/Technology 13:549-553;
Stemmer (1994) "Rapid evolution of a protein in vitro by DNA shuffling" Nature
370:389-391; and Stemmer (1994) "DNA shuffling by random fragmentation and
reassembly: In vitro recombination for molecular evolution." Proc. Natl. Acad. Sci. USA
91:10747-10751.

Mutational methods of generating diversity include, for example, site- 
directed mutagenesis (Ling et al. (1997) "Approaches to DNA mutagenesis: an overview" 
Anal Biochem. 254(2): 157-178; Dale et al. (1996) "Oligonucleotide-directed random 
mutagenesis using the phosphorothioate method" Methods Mol. Biol. 57:369-374; Smith
(1985) "In vitro mutagenesis" Ann. Rev. Genet. 19:423-462; Botstein & Shortle (1985)
"Strategies and applications of in vitro mutagenesis" Science 229:1193-1201; Carter
(1986) "Site-directed mutagenesis" Biochem. J. 237:1-7; and Kunkel (1987) "The
efficiency of oligonucleotide directed mutagenesis" in Nucleic Acids & Molecular
Biology (Eckstein, F. and Lilley, D.M.J. eds., Springer Verlag, Berlin)); mutagenesis
using uracil containing templates (Kunkel (1985) "Rapid and efficient site-specific
mutagenesis without phenotypic selection" Proc. Natl. Acad. Sci. USA 82:488-492;
Kunkel et al. (1987) "Rapid and efficient site-specific mutagenesis without phenotypic 
selection" Methods in Enzymol. 154, 367-382; and Bass et al. (1988) "Mutant Trp 
repressors with new DNA-binding specificities" Science 242:240-245); oligonucleotide- 
directed mutagenesis (Methods in Enzymol. 100: 468-500 (1983); Methods in Enzymol. 
154: 329-350 (1987); Zoller & Smith (1982) "Oligonucleotide-directed mutagenesis using 
M13-derived vectors: an efficient and general procedure for the production of point 
mutations in any DNA fragment" Nucleic Acids Res. 10:6487-6500; Zoller & Smith 
(1983) "Oligonucleotide-directed mutagenesis of DNA fragments cloned into M13
vectors" Methods in Enzymol. 100:468-500; and Zoller & Smith (1987)
"Oligonucleotide-directed mutagenesis: a simple method using two oligonucleotide primers
and a single-stranded DNA template" Methods in Enzymol. 154:329-350); phosphorothioate-modified
DNA mutagenesis (Taylor et al. (1985) "The use of phosphorothioate-modified DNA in 
restriction enzyme reactions to prepare nicked DNA" Nucl. Acids Res. 13: 8749-8764; 
Taylor et al. (1985) "The rapid generation of oligonucleotide-directed mutations at high
frequency using phosphorothioate-modified DNA" Nucl. Acids Res. 13: 8765-8787;
Nakamaye & Eckstein (1986) "Inhibition of restriction endonuclease Nci I
cleavage by phosphorothioate groups and its application to oligonucleotide-directed
mutagenesis" Nucl. Acids Res. 14: 9679-9698; Sayers et al. (1988) "5'-3' Exonucleases in
phosphorothioate-based oligonucleotide-directed mutagenesis" Nucl. Acids Res. 16:791- 
802; and Sayers et al. (1988) "Strand specific cleavage of phosphorothioate-containing 
DNA by reaction with restriction endonucleases in the presence of ethidium bromide" 
Nucl. Acids Res. 16: 803-814); mutagenesis using gapped duplex DNA (Kramer et al. 
(1984) "The gapped duplex DNA approach to oligonucleotide-directed mutation 
construction" Nucl. Acids Res. 12: 9441-9456; Kramer & Fritz (1987)
"Oligonucleotide-directed construction of mutations via gapped duplex DNA" Methods in
Enzymol. 154:350-367; Kramer et al. (1988) "Improved enzymatic in vitro reactions in the gapped
duplex DNA approach to oligonucleotide-directed construction of mutations" Nucl. Acids 
Res. 16: 7207; and Fritz et al. (1988) "Oligonucleotide-directed construction of mutations: 
a gapped duplex DNA procedure without enzymatic reactions in vitro" Nucl. Acids Res. 
16: 6987-6999). 

Additional suitable methods include point mismatch repair (Kramer et al. 
(1984) "Point Mismatch Repair" Cell 38:879-887), mutagenesis using repair-deficient host 
strains (Carter et al. (1985) "Improved oligonucleotide site-directed mutagenesis using 
M13 vectors" Nucl. Acids Res. 13: 4431-4443; and Carter (1987) "Improved 
oligonucleotide-directed mutagenesis using M13 vectors" Methods in Enzymol. 154: 382- 
403), deletion mutagenesis (Eghtedarzadeh & Henikoff (1986) "Use of oligonucleotides to 
generate large deletions" Nucl. Acids Res. 14: 5115), restriction-selection and
restriction-purification (Wells et al. (1986) "Importance of hydrogen-bond
formation in stabilizing the transition state of subtilisin" Phil. Trans. R. Soc. Lond. A 317:
415-423), mutagenesis by total gene synthesis (Nambiar et al. (1984) "Total synthesis and
cloning of a gene coding for the ribonuclease S protein" Science 223: 1299-1301; Sakamar
and Khorana (1988) "Total synthesis and expression of a gene for the α-subunit of bovine
rod outer segment guanine nucleotide-binding protein (transducin)" Nucl. Acids Res. 14:
6361-6372; Wells et al. (1985) "Cassette mutagenesis: an efficient method for generation
of multiple mutations at defined sites" Gene 34:315-323; and Grundström et al. (1985)
"Oligonucleotide-directed mutagenesis by microscale 'shot-gun' gene synthesis" Nucl.
Acids Res. 13: 3305-3316), and double-strand break repair (Mandecki (1986)
"Oligonucleotide-directed double-strand break repair in plasmids of Escherichia coli:
a method for site-specific mutagenesis" Proc. Natl. Acad. Sci. USA 83:7177-7181; and
Arnold (1993) "Protein engineering for unusual environments" Current Opinion in
Biotechnology 4:450-455).
Additional details on many of the above methods can be found in Methods in Enzymology
Volume 154, which also describes useful controls for trouble-shooting problems with
various mutagenesis methods.

Additional details regarding various diversity generating methods can be
found in the following U.S. patents, PCT publications, and EPO publications: U.S. Pat.
No. 5,605,793 to Stemmer (February 25, 1997), "Methods for In Vitro Recombination;"
U.S. Pat. No. 5,811,238 to Stemmer et al. (September 22, 1998) "Methods for Generating
Polynucleotides having Desired Characteristics by Iterative Selection and
Recombination;" U.S. Pat. No. 5,830,721 to Stemmer et al. (November 3, 1998), "DNA
Mutagenesis by Random Fragmentation and Reassembly;" U.S. Pat. No. 5,834,252 to
Stemmer, et al. (November 10, 1998) "End-Complementary Polymerase Reaction;" U.S.
Pat. No. 5,837,458 to Minshull, et al. (November 17, 1998), "Methods and Compositions
for Cellular and Metabolic Engineering;" WO 95/22625, Stemmer and Crameri,
"Mutagenesis by Random Fragmentation and Reassembly;" WO 96/33207 by Stemmer
and Lipschutz "End Complementary Polymerase Chain Reaction;" WO 97/20078 by
Stemmer and Crameri "Methods for Generating Polynucleotides having Desired
Characteristics by Iterative Selection and Recombination;" WO 97/35966 by Minshull and
Stemmer, "Methods and Compositions for Cellular and Metabolic Engineering;" WO
99/41402 by Punnonen et al. "Targeting of Genetic Vaccine Vectors;" WO 99/41383 by
Punnonen et al. "Antigen Library Immunization;" WO 99/41369 by Punnonen et al.
"Genetic Vaccine Vector Engineering;" WO 99/41368 by Punnonen et al. "Optimization
of Immunomodulatory Properties of Genetic Vaccines;" EP 752008 by Stemmer and
Crameri, "DNA Mutagenesis by Random Fragmentation and Reassembly;" EP 0932670
by Stemmer "Evolving Cellular DNA Uptake by Recursive Sequence Recombination;"
WO 99/23107 by Stemmer et al., "Modification of Virus Tropism and Host Range by
Viral Genome Shuffling;" WO 99/21979 by Apt et al., "Human Papillomavirus Vectors;"
WO 98/31837 by del Cardayre et al. "Evolution of Whole Cells and Organisms by
Recursive Sequence Recombination;" WO 98/27230 by Patten and Stemmer, "Methods
and Compositions for Polypeptide Engineering;" WO 98/13487 by Stemmer et al.,
"Methods for Optimization of Gene Therapy by Recursive Sequence Shuffling and
Selection;" WO 00/00632, "Methods for Generating Highly Diverse Libraries;" WO
00/09679, "Methods for Obtaining in Vitro Recombined Polynucleotide Sequence Banks
and Resulting Sequences;" WO 98/42832 by Arnold et al., "Recombination of
Polynucleotide Sequences Using Random or Defined Primers;" WO 99/29902 by Arnold
et al., "Method for Creating Polynucleotide and Polypeptide Sequences;" WO 98/41653
by Vind, "An in Vitro Method for Construction of a DNA Library;" WO 98/41622 by
Borchert et al., "Method for Constructing a Library Using DNA Shuffling;" WO
98/42727 by Pati and Zarling, "Sequence Alterations using Homologous Recombination;"
WO 00/18906 by Patten et al., "Shuffling of Codon-Altered Genes;" WO 00/04190 by del
Cardayre et al. "Evolution of Whole Cells and Organisms by Recursive Recombination;"
WO 00/42561 by Crameri et al., "Oligonucleotide Mediated Nucleic Acid
Recombination;" WO 00/42559 by Selifonov and Stemmer "Methods of Populating Data
Structures for Use in Evolutionary Simulations;" WO 00/42560 by Selifonov et al.,
"Methods for Making Character Strings, Polynucleotides & Polypeptides Having Desired
Characteristics;" WO 01/23401 by Welch et al., "Use of Codon-Varied Oligonucleotide
Synthesis for Synthetic Shuffling;" and PCT/US01/06775 "Single-Stranded Nucleic Acid
Template-Mediated Recombination and Nucleic Acid Fragment Isolation" by Affholter.

Certain U.S. applications provide additional details regarding various
diversity generating methods, including "SHUFFLING OF CODON ALTERED GENES"
by Patten et al. filed September 28, 1999, (USSN 09/407,800); "EVOLUTION OF
WHOLE CELLS AND ORGANISMS BY RECURSIVE SEQUENCE
RECOMBINATION", by del Cardayre et al. filed July 15, 1998 (USSN 09/166,188), and
July 15, 1999 (USSN 09/354,922); "OLIGONUCLEOTIDE MEDIATED NUCLEIC
ACID RECOMBINATION" by Crameri et al., filed September 28, 1999 (USSN
09/408,392), and "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID
RECOMBINATION" by Crameri et al., filed January 18, 2000 (PCT/US00/01203); "USE
OF CODON-BASED OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC
SHUFFLING" by Welch et al., filed September 28, 1999 (USSN 09/408,393);
"METHODS FOR MAKING CHARACTER STRINGS, POLYNUCLEOTIDES &
POLYPEPTIDES HAVING DESIRED CHARACTERISTICS" by Selifonov et al., filed
January 18, 2000, (PCT/US00/01202) and, e.g., "METHODS FOR MAKING
CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING
DESIRED CHARACTERISTICS" by Selifonov et al., filed July 18, 2000 (USSN
09/618,579); "METHODS OF POPULATING DATA STRUCTURES FOR USE IN
EVOLUTIONARY SIMULATIONS" by Selifonov and Stemmer (PCT/US00/01138),
filed January 18, 2000; and "SINGLE-STRANDED NUCLEIC ACID TEMPLATE-MEDIATED
RECOMBINATION AND NUCLEIC ACID FRAGMENT ISOLATION"
by Affholter (USSN 60/186,482, filed March 2, 2000).

In brief, several different general classes of sequence modification 
methods, such as mutation, recombination, etc. are applicable to the present invention and 
set forth, e.g., in the references above. That is, alterations to the component nucleic acid 
sequences to produce modified gene fusion constructs can be performed by any number
of the protocols described, either before cojoining of the sequences, or after the cojoining 
step. The following exemplify some of the different types of preferred formats for 
diversity generation in the context of the present invention, including, e.g., certain 
recombination based diversity generation formats. 

Nucleic acids can be recombined in vitro by any of a variety of techniques 
discussed in the references above, including e.g., DNAse digestion of nucleic acids to be 
recombined followed by ligation and/or PCR reassembly of the nucleic acids. For 
example, sexual PCR mutagenesis can be used in which random (or pseudo random, or 
even non-random) fragmentation of the DNA molecule is followed by recombination, 
based on sequence similarity, between DNA molecules with different but related DNA 
sequences, in vitro, followed by fixation of the crossover by extension in a polymerase 
chain reaction. This process and many process variants are described in several of the
references above, e.g., in Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. 
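
The fragmentation and reassembly format described above can be illustrated by a deliberately simplified sketch. It assumes that the parental sequences are pre-aligned and of equal length, and it models template switching as a random choice of donor for each segment rather than simulating annealing and extension. The function names and the two short parental sequences are hypothetical.

```python
import random


def shuffle_parents(parents, n_crossovers, rng):
    """Return one chimera built from equal-length, pre-aligned parent sequences."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    # Random breakpoints stand in for random fragmentation of the parents.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    chimera = []
    for start, end in zip(bounds, bounds[1:]):
        # Each segment is copied from a randomly chosen parent, a crude
        # stand-in for homology-driven template switching.
        donor = rng.choice(parents)
        chimera.append(donor[start:end])
    return "".join(chimera)


def make_library(parents, size, n_crossovers=3, seed=0):
    """Generate a small library of chimeras with a reproducible random seed."""
    rng = random.Random(seed)
    return [shuffle_parents(parents, n_crossovers, rng) for _ in range(size)]


# Hypothetical 18-nucleotide parents differing at several positions.
parents = ["ATGGCTAAAGGTCCGTTA", "ATGGCAAAGGGACCTTTG"]
library = make_library(parents, size=10)
```

Because every position of a chimera is inherited from one parent at the same aligned position, the library samples combinations of parental differences without introducing new point mutations; a mutational step could be layered on top if desired.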

Similarly, nucleic acids can be recursively recombined in vivo, e.g., by 
allowing recombination to occur between nucleic acids in cells. Many such in vivo 
recombination formats are set forth in the references noted above. Such formats 
optionally provide direct recombination between nucleic acids of interest, or provide 
recombination between vectors, viruses, plasmids, etc., comprising the nucleic acids of 
interest, as well as other formats. Details regarding such procedures are found in the 
references noted above. 

Whole genome recombination methods can also be used in which whole
genomes of cells or other organisms are recombined, optionally including spiking of the
genomic recombination mixtures with desired library components (e.g., genes
corresponding to the pathways of the present invention). These methods have many
applications, including those in which the identity of a target gene is not known. Details
on such methods are found, e.g., in WO 98/31837 by del Cardayre et al. "Evolution of
Whole Cells and Organisms by Recursive Sequence Recombination;" and in, e.g.,
PCT/US99/15972 by del Cardayre et al., also entitled "Evolution of Whole Cells and
Organisms by Recursive Sequence Recombination." Thus, any of these processes and
techniques for recombination, recursive recombination, and whole genome recombination,
alone or in combination, can be used to generate the modified nucleic acid sequences
and/or modified gene fusion constructs of the present invention.

Synthetic recombination methods can also be used, in which
oligonucleotides corresponding to targets of interest are synthesized and reassembled in
PCR or ligation reactions which include oligonucleotides which correspond to more than
one parental nucleic acid, thereby generating new recombined nucleic acids.
Oligonucleotides can be made by standard nucleotide addition methods, or can be made,
e.g., by tri-nucleotide synthetic approaches. Details regarding such approaches are found
in the references noted above, including, e.g., WO 00/42561 by Crameri et al.,
"Oligonucleotide Mediated Nucleic Acid Recombination;" WO 01/23401 by Welch et al.,
"Use of Codon-Varied Oligonucleotide Synthesis for Synthetic Shuffling;" WO 00/42560
by Selifonov et al., "Methods for Making Character Strings, Polynucleotides and
Polypeptides Having Desired Characteristics;" and WO 00/42559 by Selifonov and
Stemmer "Methods of Populating Data Structures for Use in Evolutionary Simulations."

In silico methods of recombination can be effected in which genetic
algorithms are used in a computer to recombine sequence strings which correspond to
homologous (or even non-homologous) nucleic acids. The resulting recombined sequence
strings are optionally converted into nucleic acids by synthesis of nucleic acids which
correspond to the recombined sequences, e.g., in concert with oligonucleotide synthesis/
gene reassembly techniques. This approach can generate random, partially random or
designed variants. Many details regarding in silico recombination, including the use of
genetic algorithms, genetic operators and the like in computer systems, combined with
generation of corresponding nucleic acids (and/or proteins), as well as combinations of
designed nucleic acids and/or proteins (e.g., based on cross-over site selection) as well as
designed, pseudo-random or random recombination methods are described in WO
00/42560 by Selifonov et al., "Methods for Making Character Strings, Polynucleotides and
Polypeptides Having Desired Characteristics" and WO 00/42559 by Selifonov and
Stemmer "Methods of Populating Data Structures for Use in Evolutionary Simulations."
Extensive details regarding in silico recombination methods are found in these
applications. This methodology is generally applicable to the present invention in
providing for recombination of nucleic acid sequences and/or gene fusion constructs
encoding proteins involved in various metabolic pathways (such as, for example,
carotenoid biosynthetic pathways, ectoine biosynthetic pathways, polyhydroxyalkanoate
biosynthetic pathways, aromatic polyketide biosynthetic pathways, and the like) in silico
and/or the generation of corresponding nucleic acids or proteins.
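
A generic illustration of recombining sequence strings with genetic operators is given below. It is a minimal sketch of a textbook genetic algorithm (single-point crossover, point mutation, truncation selection) and is not the specific method of the cited applications; the target string and the match-counting fitness function are hypothetical stand-ins for a real screening score.

```python
import random


def crossover(a, b, rng):
    """Single-point crossover between two equal-length character strings."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(seq, rate, rng, alphabet="ACGT"):
    """Replace each character with a random letter with probability `rate`."""
    return "".join(rng.choice(alphabet) if rng.random() < rate else ch for ch in seq)


def evolve(population, fitness, generations, rng, rate=0.01):
    """Iterate selection, crossover and mutation; return the population ranked best first."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        # Truncation selection: the better half survives unchanged.
        parents = ranked[: max(2, len(ranked) // 2)]
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rate, rng)
            for _ in range(len(population) - len(parents))
        ]
        population = parents + children
    return sorted(population, key=fitness, reverse=True)


# Hypothetical fitness: number of positions matching an arbitrary target string.
target = "ATGGCTAAAGGT"


def fitness(seq):
    return sum(x == y for x, y in zip(seq, target))


rng = random.Random(1)
start = ["".join(rng.choice("ACGT") for _ in target) for _ in range(8)]
final = evolve(start, fitness, generations=20, rng=rng)
```

Because the selected parents are carried over unchanged, the best score in the population cannot decrease from one generation to the next. The resulting strings would then be converted into physical nucleic acids, e.g., by oligonucleotide synthesis and gene reassembly.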

Many methods of accessing natural diversity, e.g., by hybridization of 
diverse nucleic acids or nucleic acid fragments to single-stranded templates, followed by 
polymerization and/or ligation to regenerate full-length sequences, optionally followed by 
degradation of the templates and recovery of the resulting modified nucleic acids can be 
similarly used. In one method employing a single-stranded template, the fragment 
population derived from the genomic library(ies) is annealed with partial, or, often 
approximately full length ssDNA or RNA corresponding to the opposite strand. Assembly 
of complex chimeric genes from this population is then mediated by nuclease-based
removal of non-hybridizing fragment ends, polymerization to fill gaps between such 
fragments and subsequent single stranded ligation. The parental polynucleotide strand can 
be removed by digestion (e.g., if RNA or uracil-containing), magnetic separation under 
denaturing conditions (if labeled in a manner conducive to such separation) and other 
available separation/purification methods. Alternatively, the parental strand is optionally 
co-purified with the chimeric strands and removed during subsequent screening and 
processing steps. Additional details regarding this approach are found, e.g., in "Single- 
Stranded Nucleic Acid Template-Mediated Recombination and Nucleic Acid Fragment 
Isolation" by Affholter, PCT/US01/06775. 

In another approach, single-stranded molecules are converted to double- 
stranded DNA (dsDNA) and the dsDNA molecules are bound to a solid support by ligand- 
mediated binding. After separation of unbound DNA, the selected DNA molecules are 
released from the support and introduced into a suitable host cell to generate a library
enriched in sequences which hybridize to the probe. A library produced in this manner
provides a desirable substrate for further diversification using any of the procedures 
described herein. 

Any of the preceding general recombination formats can be practiced in a
reiterative fashion (e.g., one or more cycles of mutation/recombination or other diversity
generation methods, optionally followed by one or more selection methods) to generate a
more diverse set of recombinant nucleic acids.

Mutagenesis employing polynucleotide chain termination methods has
also been proposed (see e.g., U.S. Patent No. 5,965,408, "Method of DNA reassembly by
interrupting synthesis" to Short, and the references above), and can be applied to the
present invention. In this approach, double stranded DNAs corresponding to one or more 
genes sharing regions of sequence similarity are combined and denatured, in the presence 
or absence of primers specific for the gene. The single stranded polynucleotides are then 
annealed and incubated in the presence of a polymerase and a chain terminating reagent 
(e.g., ultraviolet, gamma or X-ray irradiation; ethidium bromide or other intercalators; 
DNA binding proteins, such as single strand binding proteins, transcription activating 
factors, or histones; polycyclic aromatic hydrocarbons; trivalent chromium or a trivalent 
chromium salt; or abbreviated polymerization mediated by rapid thermocycling; and the 
like), resulting in the production of partial duplex molecules. The partial duplex 
molecules, e.g., containing partially extended chains, are then denatured and reannealed in 
subsequent rounds of replication or partial replication resulting in polynucleotides which 
share varying degrees of sequence similarity and which are diversified with respect to the 
starting population of DNA molecules. Optionally, the products, or partial pools of the 
products, can be amplified at one or more stages in the process. Polynucleotides produced 
by a chain termination method, such as described above, are suitable substrates for any 
other described recombination format. 

Diversity also can be generated in nucleic acids or populations of nucleic 
acids using a recombinational procedure termed "incremental truncation for the creation of 
hybrid enzymes" ("ITCHY") described in Ostermeier et al. (1999) "A combinatorial 
approach to hybrid enzymes independent of DNA homology" Nature Biotech 17:1205. 
This approach can be used to generate an initial library of variants which can optionally
serve as a substrate for one or more in vitro or in vivo recombination methods. See, also, 
Ostermeier et al. (1999) "Combinatorial Protein Engineering by Incremental Truncation," 
Proc. Natl. Acad. Sci. USA, 96: 3562-67; Ostermeier et al. (1999), "Incremental 
Truncation as a Strategy in the Engineering of Novel Biocatalysts," Biological and 
Medicinal Chemistry, 7: 2139-44. 
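
The homology-independent character of incremental truncation can be illustrated by enumerating the fusions it can in principle produce. The sketch below assumes that any 5' truncation endpoint in a first gene may be joined to any 3' truncation endpoint in a second gene, and that both parental sequences start in reading frame; the function names and the two 9-nucleotide example sequences are hypothetical.

```python
from itertools import product


def itchy_library(gene_a, gene_b, min_len=3):
    """Enumerate fusions of a 5' fragment of gene_a with a 3' fragment of gene_b.

    Each entry is (i, j, fusion), where gene_a is truncated after position i
    and gene_b contributes everything from position j onward.
    """
    fusions = []
    a_ends = range(min_len, len(gene_a) + 1)
    b_starts = range(0, len(gene_b) - min_len + 1)
    for i, j in product(a_ends, b_starts):
        fusions.append((i, j, gene_a[:i] + gene_b[j:]))
    return fusions


def in_frame(fusions):
    """Keep fusions whose junction preserves the codon phase of both parents."""
    return [(i, j, seq) for i, j, seq in fusions if i % 3 == j % 3]


# Hypothetical parental sequences, 9 nucleotides each.
gene_a = "ATGGCTAAA"
gene_b = "ATGCCGTTT"
all_fusions = itchy_library(gene_a, gene_b)
framed = in_frame(all_fusions)
```

Since the junction positions in the two genes are chosen independently, only about one third of the fusions are in frame, which is one reason such libraries are typically subjected to selection or screening before further recombination.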

Mutational methods which result in the alteration of individual nucleotides
or groups of contiguous or non-contiguous nucleotides can be favorably employed to
introduce nucleotide diversity into the nucleic acid sequences and/or gene fusion
constructs of the present invention. Many mutagenesis methods are found in the
above-cited references; additional details regarding mutagenesis methods can be found in
the following, which can also be applied to the present invention.

For example, error-prone PCR can be used to generate nucleic acid 
variants. Using this technique, PCR is performed under conditions where the copying 
fidelity of the DNA polymerase is low, such that a high rate of point mutations is obtained 
along the entire length of the PCR product. Examples of such techniques are found in the 
references above and, e.g., in Leung et al. (1989) Technique 1:11-15 and Caldwell et al. 
(1992) PCR Methods Applic. 2:28-33. Similarly, assembly PCR can be used, in a process 
which involves the assembly of a PCR product from a mixture of small DNA fragments. 
A large number of different PCR reactions can occur in parallel in the same reaction 
mixture, with the products of one reaction priming the products of another reaction. 
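
The effect of low copying fidelity can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of any particular polymerase: the function names, the per-base `error_rate` parameter, and the idealized doubling at each cycle are all hypothetical.

```python
import random


def error_prone_copy(template, error_rate, rng):
    """Copy a template, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Point mutation: choose one of the three other bases.
            copy.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def error_prone_pcr(template, cycles, error_rate, seed=0):
    """Idealized PCR: every molecule in the pool is copied once per cycle."""
    rng = random.Random(seed)
    pool = [template]
    for _ in range(cycles):
        pool += [error_prone_copy(s, error_rate, rng) for s in pool]
    return pool


pool = error_prone_pcr("ACGT" * 25, cycles=3, error_rate=0.01)
```

Because copies are themselves copied in later cycles, mutations accumulate along the entire length of the product, which is the property the technique relies on.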

Oligonucleotide directed mutagenesis can be used to introduce site-specific 
mutations in a nucleic acid sequence of interest. Examples of such techniques are found in 
the references above and, e.g., in Reidhaar-Olson et al. (1988) Science 241:53-57. 
Similarly, cassette mutagenesis can be used in a process that replaces a small region of a 
double stranded DNA molecule with a synthetic oligonucleotide cassette that differs from 
the native sequence. The oligonucleotide can contain, e.g., completely and/or partially 
randomized native sequence(s). 
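
As an illustrative sketch only, a partially randomized cassette can be written with IUPAC ambiguity codes and expanded into the explicit set of replacement sequences. The helper names and the example gene are hypothetical; the NNK codon used below is one common randomization scheme and is an assumption, not a requirement of the methods described.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in oligo))]


def cassette_variants(gene, start, cassette):
    """Replace gene[start:start+len(cassette)] with each cassette expansion."""
    end = start + len(cassette)
    return [gene[:start] + v + gene[end:] for v in expand_degenerate(cassette)]


# Randomize the second codon of a toy gene with an NNK cassette.
variants = cassette_variants("ATGGCTAAA", 3, "NNK")
```

An NNK codon expands to 32 sequences, so the size of the theoretical library grows as 32 to the power of the number of randomized codons.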

Recursive ensemble mutagenesis is a process in which an algorithm for 
protein mutagenesis is used to produce diverse populations of phenotypically related 
mutants, members of which differ in amino acid sequence. This method uses a feedback 
mechanism to monitor successive rounds of combinatorial cassette mutagenesis. 
Examples of this approach are found in Arkin & Youvan (1992) Proc. Natl. Acad. Sci. 
USA 89:7811-7815. 

Exponential ensemble mutagenesis can be used for generating 
combinatorial libraries with a high percentage of unique and functional mutants. Small 
groups of residues in a sequence of interest are randomized in parallel to identify, at each 
altered position, amino acids which lead to functional proteins. Examples of such 
procedures are found in Delagrave & Youvan (1993) Biotechnology Research 11:1548-1552. 

In vivo mutagenesis can be used to generate random mutations in any 
cloned DNA of interest by propagating the DNA, e.g., in a strain of E. coli that carries 
mutations in one or more of the DNA repair pathways. These "mutator" strains have a 
higher random mutation rate than that of a wild-type parent. Propagating the DNA in one 
of these strains will eventually generate random mutations within the DNA. Such 
procedures are described in the references noted above. 

Other procedures for introducing diversity into a genome, e.g., a bacterial, 
fungal, animal or plant genome, can be used in conjunction with the above described 
and/or referenced methods. For example, in addition to the methods above, techniques 
have been proposed which produce nucleic acid multimers suitable for transformation into 
a variety of species (see, e.g., Schellenberger U.S. Patent No. 5,756,316 and the references 
above). Transformation of a suitable host with such multimers, consisting of genes that 
are divergent with respect to one another (e.g., derived from natural diversity or through 
application of site directed mutagenesis, error prone PCR, passage through mutagenic 
bacterial strains, and the like), provides a source of nucleic acid diversity for DNA 
diversification, e.g., by an in vivo recombination process as indicated above. 

Alternatively, a multiplicity of monomeric polynucleotides sharing regions 
of partial sequence similarity can be transformed into a host species and recombined in 
vivo by the host cell. Subsequent rounds of cell division can be used to generate libraries, 
members of which include a single, homogeneous population, or pool, of monomeric 
polynucleotides. Alternatively, the monomeric nucleic acid can be recovered by standard 
techniques, e.g., PCR and/or cloning, and recombined in any of the recombination 
formats, including recursive recombination formats, described above. 

Methods for generating multispecies expression libraries have been 
described (in addition to the reference noted above, see, e.g., Peterson et al. (1998) U.S. 
Pat. No. 5,783,431 "METHODS FOR GENERATING AND SCREENING NOVEL 
METABOLIC PATHWAYS," and Thompson, et al. (1998) U.S. Pat. No. 5,824,485 
"METHODS FOR GENERATING AND SCREENING NOVEL METABOLIC 
PATHWAYS") and their use to identify protein activities of interest has been proposed (in 
addition to the references noted above, see Short (1999) U.S. Pat. No. 5,958,672 
"PROTEIN ACTIVITY SCREENING OF CLONES HAVING DNA FROM 
UNCULTIVATED MICROORGANISMS"). Multispecies expression libraries include, in 
general, libraries comprising cDNA or genomic sequences from a plurality of species or 
strains, operably linked to appropriate regulatory sequences, in an expression cassette. 
The cDNA and/or genomic sequences are optionally randomly ligated to further enhance 
diversity. The vector can be a shuttle vector suitable for transformation and expression in 
more than one species of host organism, e.g., bacterial species, eukaryotic cells. In some 
cases, the library is biased by preselecting sequences which encode a protein of interest, or 
which hybridize to a nucleic acid of interest. Any such libraries can be provided as 
substrates for any of the methods herein described. 

The above described procedures have been largely directed to increasing 
nucleic acid and/or encoded protein diversity. However, in many cases, not all of the 
diversity is useful, e.g., functional, and contributes merely to increasing the background of 
variants that must be screened or selected to identify the few favorable variants. In some 
applications, it is desirable to preselect or prescreen libraries (e.g., an amplified library, a 
genomic library, a cDNA library, a normalized library, etc.) or other substrate nucleic 
acids prior to diversification, e.g., by recombination-based mutagenesis procedures, or to 
otherwise bias the substrates towards nucleic acids that encode functional products. For 
example, in the case of antibody engineering, it is possible to bias the diversity generating 
process toward antibodies with functional antigen binding sites by taking advantage of in 
vivo recombination events prior to manipulation by any of the described methods. For 
example, recombined CDRs derived from B cell cDNA libraries can be amplified and 
assembled into framework regions (e.g., Jirholt et al. (1998) "Exploiting sequence space: 
shuffling in vivo formed complementarity determining regions into a master framework" 
Gene 215: 471) prior to diversifying according to any of the methods described herein. 

Libraries can be biased towards nucleic acids which encode proteins with 
desirable enzyme activities. For example, after identifying a clone from a library which 
exhibits a specified activity, the clone can be mutagenized using any known method for 
introducing DNA alterations. A library comprising the mutagenized homologues is then 
screened for a desired activity, which can be the same as or different from the initially 
specified activity. An example of such a procedure is proposed in Short (1999) U.S. 
Patent No. 5,939,250 for "PRODUCTION OF ENZYMES HAVING DESIRED 
ACTIVITIES BY MUTAGENESIS." Desired activities can be identified by any method 
known in the art. For example, WO 99/10539 proposes that gene libraries can be screened 
by combining extracts from the gene library with components obtained from metabolically 
rich cells and identifying combinations which exhibit the desired activity. It has also been 
proposed (e.g., WO 98/58085) that clones with desired activities can be identified by 
inserting bioactive substrates into samples of the library, and detecting bioactive 
fluorescence corresponding to the product of a desired activity using a fluorescent 
analyzer, e.g., a flow cytometry device, a CCD, a fluorometer, or a spectrophotometer. 

Libraries can also be biased towards nucleic acids which have specified 
characteristics, e.g., hybridization to a selected nucleic acid probe. For example, 
application WO 99/10539 proposes that polynucleotides encoding a desired activity (e.g., 
an enzymatic activity, for example: a lipase, an esterase, a protease, a glycosidase, a 
glycosyl transferase, a phosphatase, a kinase, an oxygenase, a peroxidase, a hydrolase, a 
hydratase, a nitrilase, a transaminase, an amidase or an acylase) can be identified from 
among genomic DNA sequences in the following manner. Single stranded DNA 
molecules from a population of genomic DNA are hybridized to a ligand-conjugated 
probe. The genomic DNA can be derived from either a cultivated or uncultivated 
microorganism, or from an environmental sample. Alternatively, the genomic DNA can be 
derived from a multicellular organism, or a tissue derived therefrom. Second strand 
synthesis can be conducted directly from the hybridization probe used in the capture, with 
or without prior release from the capture medium, or by a wide variety of other strategies 
known in the art. Alternatively, the isolated single-stranded genomic DNA population can 
be fragmented without further cloning and used directly in, e.g., a recombination-based 
approach that employs a single-stranded template, as described above. 

"Non-Stochastic" methods of generating nucleic acids and polypeptides are 
alleged in Short "Non-Stochastic Generation of Genetic Vaccines and Enzymes" WO 
00/46344. These methods, including proposed non-stochastic polynucleotide reassembly 
and site-saturation mutagenesis methods, can be applied to the present invention as well. 
Random or semi-random mutagenesis using doped or degenerate oligonucleotides is also 
described in, e.g., Arkin and Youvan (1992) "Optimizing nucleotide mixtures to encode 
specific subsets of amino acids for semi-random mutagenesis" Biotechnology 10:297-300; 
Reidhaar-Olson et al. (1991) "Random mutagenesis of protein sequences using 
oligonucleotide cassettes" Methods Enzymol. 208:564-86; Lim and Sauer (1991) "The 
role of internal packing interactions in determining the structure and stability of a protein" 
J. Mol. Biol. 219:359-76; Breyer and Sauer (1989) "Mutational analysis of the fine 
specificity of binding of monoclonal antibody 51F to lambda repressor" J. Biol. Chem. 
264:13355-60; and "Walk-Through Mutagenesis" (Crea, R; US Patents 5,830,650 and 
5,798,208, and EP Patent 0527809 B1). 

It will readily be appreciated that any of the above described techniques 
suitable for enriching a library prior to diversification can also be used to screen the 
products, or libraries of products, produced by the diversity generating methods. Any of 
the above described methods can be practiced recursively or in combination to alter 
nucleic acids, e.g., GAT encoding polynucleotides. 

Kits for mutagenesis, library construction and other diversity generation 
methods are also commercially available. For example, kits are available from, e.g., 
Stratagene (e.g., QuikChange™ site-directed mutagenesis kit; and Chameleon™ double- 
stranded, site-directed mutagenesis kit), Bio/Can Scientific, Bio-Rad (e.g., using the 
Kunkel method described above), Boehringer Mannheim Corp., Clontech Laboratories, 
DNA Technologies, Epicentre Technologies (e.g., 5 prime 3 prime kit), Genpak Inc, 
Lemargo Inc, Life Technologies (Gibco BRL), New England Biolabs, Pharmacia Biotech, 
Promega Corp., Quantum Biotechnologies, Amersham International plc (e.g., using the 
Eckstein method above), and Anglian Biotechnology Ltd (e.g., using the Carter/Winter 
method above). 

The above references provide many mutational formats, including 
recombination, recursive recombination, recursive mutation and combinations of 
recombination with other forms of mutagenesis, as well as many modifications of these 
formats. Regardless of the diversity generation format that is used, the nucleic acids of the 
present invention can be recombined (with each other, or with related (or even unrelated) 
sequences) to produce a diverse set of recombinant nucleic acids for use in the gene fusion 
constructs and modified gene fusion constructs of the present invention, including, e.g., 
sets of homologous nucleic acids, as well as corresponding polypeptides. 

Many of the above-described methodologies for generating modified 
polynucleotides generate a large number of diverse variants of a parental sequence or 
sequences. In some preferred embodiments of the invention the modification technique 
(e.g., some form of shuffling) is used to generate a library of variants that is then screened 
for a modified polynucleotide or pool of modified polynucleotides encoding some desired 
functional attribute, e.g., improved GAT activity. Exemplary enzymatic activities that can 
be screened for include catalytic rates (conventionally characterized in terms of kinetic 
constants such as kcat and Km), substrate specificity, and susceptibility to activation or 
inhibition by substrate, product or other molecules (e.g., inhibitors or activators). 
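
For orientation, the kinetic constants mentioned above are conventionally related through the Michaelis-Menten rate law, v = kcat·[E]·[S] / (Km + [S]). The following is a minimal sketch of how variants might be ranked by catalytic efficiency (kcat/Km); the function names and the numerical values are hypothetical and are not data from this disclosure.

```python
def mm_rate(kcat, km, enzyme, substrate):
    """Michaelis-Menten initial rate: v = kcat*[E]*[S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def rank_by_efficiency(variants):
    """Sort variant names by catalytic efficiency kcat/Km, best first.

    variants maps a name to a (kcat, Km) tuple in consistent units.
    """
    return sorted(variants, key=lambda n: variants[n][0] / variants[n][1],
                  reverse=True)


# Hypothetical values for illustration only.
variants = {"parent": (1.0, 2.0), "shuffled": (5.0, 1.0)}
ranking = rank_by_efficiency(variants)
```

Ranking on kcat/Km is one reasonable choice when substrate concentrations are expected to be low relative to Km; other screens may weight kcat or Km separately.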

One example of selection for a desired enzymatic activity entails growing 
host cells under conditions that inhibit the growth and/or survival of cells that do not 
sufficiently express an enzymatic activity of interest, e.g., the GAT activity. Using such a 
selection process can eliminate from consideration all modified polynucleotides except 
those encoding a desired enzymatic activity. For example, in some embodiments of the 
invention host cells are maintained under conditions that inhibit cell growth or survival in 
the absence of sufficient levels of GAT, e.g., a concentration of glyphosate that is lethal to or 
inhibits the growth of a wild-type plant of the same variety that does not express a GAT 
polynucleotide. Under these conditions, only a host cell harboring a modified nucleic acid 
that encodes enzymatic activity or activities able to catalyze production of sufficient levels 
of the product will survive and grow. Some embodiments of the invention employ 
multiple rounds of screening at increasing concentrations of glyphosate or a glyphosate 
analog. 
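
The logic of multiple selection rounds at increasing stringency can be sketched as a simple filter. This is an idealized illustration, not a description of any experiment herein: it assumes each variant has a single hypothetical tolerance threshold and survives a round only if that threshold meets or exceeds the applied concentration. The names and numbers are invented for the example.

```python
def select_rounds(tolerance, concentrations):
    """Apply successive selection rounds and record the survivors of each.

    tolerance maps a variant name to the highest concentration it survives
    (hypothetical units); concentrations lists the level used in each round.
    """
    survivors = dict(tolerance)
    history = []
    for conc in concentrations:
        survivors = {n: t for n, t in survivors.items() if t >= conc}
        history.append((conc, sorted(survivors)))
    return history


history = select_rounds({"wt": 0.5, "v1": 2.0, "v2": 8.0}, [1, 4, 16])
```

In this toy run the wild-type control is removed in the first round and only the most tolerant variant remains after the second, which mirrors the enrichment that increasing concentrations are intended to achieve.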

In some embodiments of the invention, mass spectrometry is used to detect 
the acetylation of glyphosate, or a glyphosate analog or metabolite. The use of mass 
spectrometry is described in more detail in the Examples below. 

For convenience and high throughput it will often be desirable to 
screen/select for desired modified nucleic acids in a microorganism, e.g., a bacterium such 
as E. coli. On the other hand, screening in plant cells or plants will in some cases be 
preferable where the ultimate aim is to generate a modified nucleic acid for expression in a 
plant system. 

In some preferred embodiments of the invention throughput is increased by 
screening pools of host cells expressing different modified nucleic acids, either alone or as 
part of a gene fusion construct. Any pools showing significant activity can be 
deconvolved to identify single clones expressing the desirable activity. 
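
The saving in assay count from pooling followed by deconvolution can be illustrated as follows. This sketch assumes, purely for illustration, that a pool's signal is the sum of its members' non-negative activities and that a fixed threshold defines "significant" activity; the function name, clone names and values are hypothetical.

```python
def screen_pools(activities, pool_size, threshold):
    """Screen clones in pools, then deconvolve only the active pools.

    Returns (hits, assays): the clones at or above threshold and the total
    number of assays run (one per pool plus one per clone in active pools).
    """
    names = sorted(activities)
    pools = [names[i:i + pool_size] for i in range(0, len(names), pool_size)]
    hits, assays = [], 0
    for pool in pools:
        assays += 1                                   # assay the pool
        if sum(activities[n] for n in pool) >= threshold:
            for n in pool:                            # deconvolve the pool
                assays += 1
                if activities[n] >= threshold:
                    hits.append(n)
    return hits, assays


# Twelve hypothetical clones, one of which is active.
activities = {f"c{i:02d}": 0.0 for i in range(12)}
activities["c07"] = 5.0
hits, assays = screen_pools(activities, pool_size=4, threshold=1.0)
```

Here seven assays identify the single active clone, compared with twelve if every clone were assayed individually; the benefit shrinks as the fraction of active clones rises.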

The skilled artisan will recognize that the relevant assay, screening or 
selection method will vary depending upon the desired host organism, etc. It is normally 
advantageous to employ an assay that can be practiced in a high-throughput format. 

In high throughput assays, it is possible to screen up to several thousand 
different variants in a single day. For example, each well of a microtiter plate can be used 
to run a separate assay, or, if concentration or incubation time effects are to be observed, 
every 5-10 wells can test a single variant. 
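
The plate arithmetic implied above can be made explicit. The sketch below assumes a 96-well plate format, which is an assumption of this example rather than a limitation stated in the text, and uses hypothetical function names.

```python
import math


def variants_per_plate(wells=96, wells_per_variant=1):
    """Number of variants that fit on one plate (whole variants only)."""
    return wells // wells_per_variant


def plates_needed(n_variants, wells=96, wells_per_variant=1):
    """Plates required to assay n_variants at the given replication."""
    return math.ceil(n_variants / variants_per_plate(wells, wells_per_variant))


single = plates_needed(5000)                          # one well per variant
replicated = plates_needed(5000, wells_per_variant=10)  # ten wells per variant
```

Under these assumptions, observing concentration or time effects with ten wells per variant increases the plate count roughly tenfold for the same number of variants.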

In addition to fluidic approaches, it is possible, as mentioned above, simply 
to grow cells on media plates that select for the desired enzymatic or metabolic function. 
This approach offers a simple and high-throughput screening method. 

A number of well known robotic systems have also been developed for 
solution phase chemistries useful in assay systems. These systems include automated 
workstations like the automated synthesis apparatus developed by Takeda Chemical 
Industries, LTD. (Osaka, Japan) and many robotic systems utilizing robotic arms (Zymate 
II, Zymark Corporation, Hopkinton, MA; Orca, Hewlett-Packard, Palo Alto, CA) which 
mimic the manual synthetic operations performed by a scientist. Any of the above devices 
are suitable for application to the present invention. The nature and implementation of 
modifications to these devices (if any) so that they can operate as discussed herein with 
reference to the integrated system will be apparent to persons skilled in the relevant art. 

High throughput screening systems are commercially available (see, e.g., 
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman 
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These 
systems typically automate entire procedures including all sample and reagent pipetting, 
liquid dispensing, timed incubations, and final readings of the microplate in detector(s) 
appropriate for the assay. These configurable systems provide high throughput and rapid 
start up as well as a high degree of flexibility and customization. 

The manufacturers of such systems provide detailed protocols for the 
various high throughput devices. Thus, for example, Zymark Corp. provides technical 
bulletins describing screening systems for detecting the modulation of gene transcription, 
ligand binding, and the like. Microfluidic approaches to reagent manipulation have also 
been developed, e.g., by Caliper Technologies (Mountain View, CA). 

Optical images viewed (and, optionally, recorded) by a camera or other 
recording device (e.g., a photodiode and data storage device) are optionally further 
processed in any of the embodiments herein, e.g., by digitizing the image and/or storing 
and analyzing the image on a computer. A variety of commercially available peripheral 
equipment and software is available for digitizing, storing and analyzing a digitized video 
or digitized optical image, e.g., using PC (Intel x86 or Pentium chip compatible DOS™, 
OS2™, WINDOWS™, WINDOWS NT™ or WINDOWS 95™ based machines), 
MACINTOSH™, or UNIX based (e.g., SUN™ work station) computers. 

One conventional system carries light from the assay device to a cooled 
charge-coupled device (CCD) camera, as is common in the art. A CCD camera includes 
an array of picture elements (pixels). The light from the specimen is imaged on the CCD. 
Particular pixels corresponding to regions of the specimen (e.g., individual hybridization 
sites on an array of biological polymers) are sampled to obtain light intensity readings for 
each position. Multiple pixels are processed in parallel to increase speed. The apparatus 
and methods of the invention are easily used for viewing any sample, e.g. by fluorescent 
or dark field microscopic techniques. 
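
Sampling the pixels that correspond to a region of the specimen can be sketched as below. The image is represented as a plain list of rows of intensity values, and the function name, region coordinates and toy image are hypothetical; real acquisition software and file formats are outside the scope of this illustration.

```python
def region_intensity(image, top, left, height, width):
    """Mean intensity of the pixels in a rectangular region of the image."""
    pixels = [p
              for row in image[top:top + height]
              for p in row[left:left + width]]
    return sum(pixels) / len(pixels)


# A toy 4x4 image containing four 2x2 "sites".
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [5, 5, 20, 20],
    [5, 5, 20, 20],
]
site_readings = {
    (r, c): region_intensity(image, r, c, 2, 2)
    for r in (0, 2) for c in (0, 2)
}
```

Each site is reduced to a single intensity reading, which can then be compared across positions or against controls.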

OTHER POLYNUCLEOTIDE COMPOSITIONS 

The invention also includes compositions comprising two or more 
polynucleotides of the invention (e.g., as substrates for recombination). The composition 
can comprise a library of recombinant nucleic acids, where the library contains at least 2, 
3, 5, 10, 20, or 50 or more polynucleotides. The polynucleotides are optionally cloned 
into expression vectors, providing expression libraries. 

The invention also includes compositions produced by digesting one or 
more polynucleotides of the invention with a restriction endonuclease, an RNAse, or a 
DNAse (e.g., as is performed in certain of the recombination formats noted above); and 
compositions produced by fragmenting or shearing one or more polynucleotides of the 
invention by mechanical means (e.g., sonication, vortexing, and the like), which can also 
be used to provide substrates for recombination in the methods above. Similarly, 
compositions comprising sets of oligonucleotides corresponding to more than one nucleic 
acid of the invention are useful as recombination substrates and are a feature of the 
invention. For convenience, these fragmented, sheared, or oligonucleotide synthesized 
mixtures are referred to as fragmented nucleic acid sets. 
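
As an illustration of how a fragmented nucleic acid set might be represented computationally, the following sketch performs an in silico restriction digest of a character string. It considers only the top strand and a single recognition site; the function name and example sequence are hypothetical. The EcoRI site GAATTC, cut after the first G, is used as a familiar example.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site, cut_offset bases into the site.

    Top strand only; returns the fragments in 5'-to-3' order.
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# EcoRI recognizes GAATTC and cuts between G and AATTC.
fragments = digest("AAAGAATTCTTTGAATTCGGG", "GAATTC", 1)
```

Concatenating the fragments in order regenerates the input string, which provides a simple consistency check on the digest.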

Also included in the invention are compositions produced by incubating 
one or more of the fragmented nucleic acid sets in the presence of ribonucleotide- or 
deoxyribonucleotide triphosphates and a nucleic acid polymerase. This resulting 
composition forms a recombination mixture for many of the recombination formats noted 
above. The nucleic acid polymerase may be an RNA polymerase, a DNA polymerase, or 
an RNA-directed DNA polymerase (e.g., a "reverse transcriptase"); the polymerase can 
be, e.g., a thermostable DNA polymerase (such as, VENT, TAQ, or the like). 

INTEGRATED SYSTEMS 

The present invention provides computers, computer readable media and 
integrated systems comprising character strings corresponding to the sequence information 
herein for the polypeptides and nucleic acids herein, including, e.g., those sequences listed 
herein and the various silent substitutions and conservative substitutions thereof. 
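
To illustrate what is meant by silent substitutions of a sequence character string, the sketch below enumerates single-codon changes that leave the encoded amino acid sequence unchanged. It assumes the standard genetic code; the function names and the toy coding sequence are hypothetical and are not sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silent_codons(codon):
    """Other codons that encode the same amino acid as codon."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


def silent_variants(dna):
    """All sequences differing from dna by one synonymous codon change."""
    usable = len(dna) - len(dna) % 3
    return [dna[:i] + alt + dna[i + 3:]
            for i in range(0, usable, 3)
            for alt in silent_codons(dna[i:i + 3])]


variants = silent_variants("ATGGCTAAA")
```

Every variant produced this way translates to the same polypeptide string as the input, so a character-string database can be expanded with silent substitutions without altering the encoded protein.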

For example, various methods and genetic algorithms (GAs) known in the 
art can be used to detect homology or similarity between different character strings, or can 
be used to perform other desirable functions such as to control output files, provide the 
basis for making presentations of information including the sequences and the like. 
Examples include BLAST, discussed supra. 

Thus, different types of homology and similarity of various stringency and 
length can be detected and recognized in the integrated systems herein. For example, 
many homology determination methods have been designed for comparative analysis of 
sequences of biopolymers, for spell-checking in word processing, and for data retrieval 
from various databases. With an understanding of double-helix pair-wise complement 
interactions among 4 principal nucleobases in natural polynucleotides, models that 
simulate annealing of complementary homologous polynucleotide strings can also be used 
as a foundation of sequence alignment or other operations typically performed on the 
character strings corresponding to the sequences herein (e.g., word-processing 
manipulations, construction of figures comprising sequence or subsequence character 
strings, output tables, etc.). An example of a software package with GAs for calculating 
sequence similarity is BLAST, which can be adapted to the present invention by inputting 
character strings corresponding to the sequences herein. 
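
A similarity calculation on character strings can be sketched with a textbook global alignment score (Needleman-Wunsch dynamic programming). This is deliberately not BLAST, which uses a heuristic local alignment strategy; the function name and the scoring parameters below are hypothetical defaults chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b (linear gap cost)."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


identical = global_alignment_score("GATTACA", "GATTACA")
one_gap = global_alignment_score("GATTACA", "GATACA")
```

Only two rows of the scoring matrix are kept in memory, which is sufficient when the score, rather than the alignment itself, is required.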

Similarly, standard desktop applications such as word processing software 
(e.g., Microsoft Word™ or Corel WordPerfect™) and database software (e.g., spreadsheet 
software such as Microsoft Excel™, Corel Quattro Pro™, or database programs such as 
Microsoft Access™ or Paradox™) can be adapted to the present invention by inputting a 
character string corresponding to the GAT homologues of the invention (either nucleic 
acids or proteins, or both). For example, the integrated systems can include the foregoing 
software having the appropriate character string information, e.g., used in conjunction with 
a user interface (e.g., a GUI in a standard operating system such as a Windows, Macintosh 
or LINUX system) to manipulate strings of characters. As noted, specialized alignment 
programs such as BLAST can also be incorporated into the systems of the invention for 
alignment of nucleic acids or proteins (or corresponding character strings). 

Integrated systems for analysis in the present invention typically include a 
digital computer with GA software for aligning sequences, as well as data sets entered into 
the software system comprising any of the sequences herein. The computer can be, e.g., a 
PC (Intel x86 or Pentium chip-compatible DOS™, OS2™, WINDOWS™, WINDOWS 
NT™, WINDOWS95™, WINDOWS98™, or LINUX based machine), a MACINTOSH™, 
Power PC, or a UNIX based (e.g., SUN™ work station) machine, or other commercially 
common computer which is known to one of skill. Software for aligning or otherwise 
manipulating sequences is available, or can easily be constructed by one of skill using a 
standard programming language such as Visual Basic, Fortran, Basic, Java, or the like. 

Any controller or computer optionally includes a monitor which is often a 
cathode ray tube ("CRT") display, a flat panel display (e.g., active matrix liquid crystal 
display, liquid crystal display), or others. Computer circuitry is often placed in a box 
which includes numerous integrated circuit chips, such as a microprocessor, memory, 
interface circuits, and others. The box also optionally includes a hard disk drive, a floppy 
disk drive, a high capacity removable drive such as a writeable CD-ROM, and other 
common peripheral elements. Inputting devices such as a keyboard or mouse optionally 
provide for input from a user and for user selection of sequences to be compared or 
otherwise manipulated in the relevant computer system. 

The computer typically includes appropriate software for receiving user 
instructions, either in the form of user input into a set of parameter fields, e.g., in a GUI, or in 
the form of preprogrammed instructions, e.g., preprogrammed for a variety of different 
specific operations. The software then converts these instructions to appropriate language 
for instructing the operation of the fluid direction and transport controller to carry out the 
desired operation. 

The software can also include output elements for controlling nucleic acid 
synthesis (e.g., based upon a sequence or an alignment of sequences herein) or other 
operations which occur downstream from an alignment or other operation performed using 
a character string corresponding to a sequence herein. Nucleic acid synthesis equipment 
can, accordingly, be a component in one or more integrated systems herein. 

In an additional aspect, the present invention provides kits embodying the 
methods, compositions, systems and apparatus herein. Kits of the invention optionally 
comprise one or more of the following: (1) an apparatus, system, system component or 
apparatus component as described herein; (2) instructions for practicing the methods 
described herein, and/or for operating the apparatus or apparatus components herein 
and/or for using the compositions herein; (3) one or more GAT composition or 
component; (4) a container for holding components or compositions, and, (5) packaging 
materials. 

In a further aspect, the present invention provides for the use of any 
apparatus, apparatus component, composition or kit herein, for the practice of any method 
or assay herein, and/or for the use of any apparatus or kit to practice any assay or method 
herein. 

HOST CELLS AND ORGANISMS 

The host cell can be eukaryotic, for example, a eukaryotic cell, a plant cell, 
an animal cell, a protoplast, or a tissue culture. The host cell optionally comprises a 
plurality of cells, for example, an organism. Alternatively, the host cell can be prokaryotic 
including, but not limited to, bacteria (e.g., gram positive bacteria, purple bacteria, green 
sulfur bacteria, green non-sulfur bacteria, cyanobacteria, spirochetes, thermotogales, 
flavobacteria, and bacteroides) and archaebacteria (e.g., Korarchaeota, Thermoproteus, 
Pyrodictium, Thermococcales, methanogens, Archaeoglobus, and extreme halophiles). 

Transgenic plants, or plant cells, incorporating the GAT nucleic acids, 
and/or expressing the GAT polypeptides of the invention are a feature of the invention. 
The transformation of plant cells and protoplasts can be carried out in essentially any of 
the various ways known to those skilled in the art of plant molecular biology, including, 
but not limited to, the methods described herein. See, in general, Methods in Enzymology, 
Vol. 153 (Recombinant DNA Part D) Wu and Grossman (eds.) 1987, Academic Press, 
incorporated herein by reference. As used herein, the term "transformation" means 
alteration of the genotype of a host plant by the introduction of a nucleic acid sequence, 
e.g., a "heterologous" or "foreign" nucleic acid sequence. The heterologous nucleic acid 
sequence need not necessarily originate from a different source but it will, at some point, 
have been external to the cell into which it is introduced. 

In addition to Berger, Ausubel and Sambrook, useful general references for 
plant cell cloning, culture and regeneration include Jones (ed) (1995) Plant Gene Transfer 
and Expression Protocols, Methods in Molecular Biology, Volume 49, Humana Press, 
Totowa, NJ; Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John 
Wiley & Sons, Inc. New York, NY (Payne); and Gamborg and Phillips (eds) (1995) Plant 
Cell, Tissue and Organ Culture: Fundamental Methods Springer Lab Manual, Springer- 
Verlag (Berlin Heidelberg New York) (Gamborg). A variety of cell culture media are 
described in Atlas and Parks (eds) The Handbook of Microbiological Media (1993) CRC 
Press, Boca Raton, FL (Atlas). Additional information for plant cell culture is found in 
available commercial literature such as the Life Science Research Cell Culture Catalogue 
(1998) from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-LSRCCC) and, e.g., the Plant 
Culture Catalogue and supplement (1997) also from Sigma-Aldrich, Inc (St Louis, MO) 
(Sigma-PCCS). Additional details regarding plant cell culture are found in Croy, (ed.) 
(1993) Plant Molecular Biology Bios Scientific Publishers, Oxford, U.K. 

In an embodiment of this invention, recombinant vectors including one or 
more GAT polynucleotides, suitable for the transformation of plant cells, are prepared. A 
DNA sequence encoding the desired GAT polypeptide, e.g., selected from among SEQ 
ID NOS: 1-5 and 11-262, is conveniently used to construct a recombinant expression 
cassette which can be introduced into the desired plant. In the context of the present 
invention, an expression cassette will typically comprise a selected GAT polynucleotide 
operably linked to a promoter sequence and other transcriptional and translational 
initiation regulatory sequences which are sufficient to direct the transcription of the GAT 
sequence in the intended tissues (e.g., entire plant, leaves, roots, etc.) of the transformed 
plant. 

For example, a strongly or weakly constitutive plant promoter that directs 
expression of a GAT nucleic acid in all tissues of a plant can be favorably employed. 
Such promoters are active under most environmental conditions and states of development 
or cell differentiation. Examples of constitutive promoters include the 1'- or 2'- promoter 
of Agrobacterium tumefaciens, and other transcription initiation regions from various plant 
genes known to those of skill. Where overexpression of a GAT polypeptide of the 
invention is detrimental to the plant, one of skill will recognize that weak constitutive 
promoters can be used for low levels of expression. In those cases where high levels of 
expression are not harmful to the plant, a strong promoter, e.g., a tRNA or other pol III 
promoter, or a strong pol II promoter (e.g., the cauliflower mosaic virus (CaMV) 
35S promoter), can be used. 

Alternatively, a plant promoter can be under environmental control. Such 
promoters are referred to as "inducible" promoters. Examples of environmental 
conditions that may alter transcription by inducible promoters include pathogen attack, 
anaerobic conditions, or the presence of light. In some cases, it is desirable to use 
promoters that are "tissue-specific" and/or are under developmental control such that the 
GAT polynucleotide is expressed only in certain tissues or stages of development, e.g., 
leaves, roots, shoots, etc. Endogenous promoters of genes related to herbicide tolerance 
and related phenotypes are particularly useful for driving expression of GAT nucleic acids, 
e.g., P450 monooxygenases, glutathione-S-transferases, homoglutathione-S-transferases, 
glyphosate oxidases and 5-enolpyruvylshikimate-3-phosphate synthases.

Tissue specific promoters can also be used to direct expression of
heterologous structural genes, including the GAT polynucleotides described herein. Thus
the promoters can be used in recombinant expression cassettes to drive expression of any
gene whose expression is desirable in the transgenic plants of the invention, e.g., GAT
and/or other genes conferring herbicide resistance or tolerance, or genes which influence
other useful characteristics, e.g., heterosis. Similarly, enhancer elements, e.g., derived
from the 5' regulatory sequences or intron of a heterologous gene, can also be used to
improve expression of a heterologous structural gene, such as a GAT polynucleotide.

In general, the particular promoter used in the expression cassette in plants
depends on the intended application. Any of a number of promoters which direct
transcription in plant cells can be suitable. The promoter can be either constitutive or
inducible. In addition to the promoters noted above, promoters of bacterial origin which
operate in plants include the octopine synthase promoter, the nopaline synthase promoter
and other promoters derived from Ti plasmids. See, Herrera-Estrella et al. (1983) Nature
303:209. Viral promoters include the 35S and 19S RNA promoters of CaMV. See, Odell
et al., (1985) Nature 313:810. Other plant promoters include the
ribulose-1,5-bisphosphate carboxylase small subunit promoter and the phaseolin promoter. The
promoter sequence from the E8 gene (see, Deikman and Fischer (1988) EMBO J 7:3315)
and other genes are also favorably used. Promoters specific for monocotyledonous species
are also considered (McElroy D., Brettell R.I.S. 1994. Foreign gene expression in
transgenic cereals. Trends Biotech., 12:62-68.) Alternatively, novel promoters with
useful characteristics can be identified from any viral, bacterial, or plant source by
methods, including sequence analysis, enhancer or promoter trapping, and the like, known
in the art.

In preparing expression vectors of the invention, sequences other than the
promoter and the GAT encoding gene are also favorably used. If proper polypeptide
expression is desired, a polyadenylation region can be derived from the natural gene, from
a variety of other plant genes, or from T-DNA. Signal/localization peptides, which, e.g.,
facilitate translocation of the expressed polypeptide to internal organelles (e.g.,
chloroplasts) or extracellular secretion, can also be employed.

The vector comprising the GAT polynucleotide also can include a marker
gene which confers a selectable phenotype on plant cells. For example, the marker may
encode biocide tolerance, particularly antibiotic tolerance, such as tolerance to kanamycin,
G418, bleomycin, hygromycin, or herbicide tolerance, such as tolerance to chlorosulfuron
or phosphinothricin. Reporter genes, which are used to monitor gene expression and
protein localization via visualizable reaction products (e.g., beta-glucuronidase,
beta-galactosidase, and chloramphenicol acetyltransferase) or by direct visualization of the
gene product itself (e.g., green fluorescent protein, GFP; Sheen et al. (1995) The Plant
Journal 8:777) can be used for, e.g., monitoring transient gene expression in plant cells.
Transient expression systems can be employed in plant cells, for example, in screening
plant cell cultures for herbicide tolerance activities.

PLANT TRANSFORMATION 
Protoplasts 

Numerous protocols for establishment of transformable protoplasts from a
variety of plant types and subsequent transformation of the cultured protoplasts are
available in the art and are incorporated herein by reference. For examples, see
Hashimoto et al. (1990) Plant Physiol. 93:857; Fowke and Constabel (eds) (1994) Plant
Protoplasts; Saunders et al. (1993) Applications of Plant In Vitro Technology Symposium,
UPM 16-18; and Lyznik et al. (1991) BioTechniques 10:295, each of which is
incorporated herein by reference.

Chloroplasts

Chloroplasts are a site of action of some herbicide tolerance activities, and,
in some instances, the GAT polynucleotide is fused to a chloroplast transit sequence
peptide to facilitate translocation of the gene products into the chloroplasts. In these cases,
it can be advantageous to transform the GAT polynucleotide into the chloroplasts of the
plant host cells. Numerous methods are available in the art to accomplish chloroplast
transformation and expression (e.g., Daniell et al. (1998) Nature Biotechnology 16:346;
O'Neill et al. (1993) The Plant Journal 3:729; Maliga (1993) TIBTECH 11:1). The
expression construct comprises a transcriptional regulatory sequence functional in plants
operably linked to a polynucleotide encoding the GAT polypeptide. Expression cassettes
that are designed to function in chloroplasts (such as an expression cassette including a
GAT polynucleotide) include the sequences necessary to ensure expression in
chloroplasts. Typically, the coding sequence is flanked by two regions of homology to the
chloroplastid genome to effect a homologous recombination with the chloroplast genome;
often a selectable marker gene is also present within the flanking plastid DNA sequences
to facilitate selection of genetically stable transformed chloroplasts in the resultant
transplastomic plant cells (see, e.g., Maliga (1993) and Daniell (1998), and references cited
therein).

General transformation methods

DNA constructs of the invention can be introduced into the genome of the
desired plant host by a variety of conventional techniques. Techniques for transforming a
wide variety of higher plant species are well known and described in the technical and
scientific literature. See, e.g., Payne, Gamborg, Croy, Jones, etc. all supra, as well as, e.g.,
Weising et al. (1988) Ann. Rev. Genet. 22:421.

For example, DNAs can be introduced directly into the genomic DNA of a
plant cell using techniques such as electroporation and microinjection of plant cell
protoplasts, or the DNA constructs can be introduced directly to plant tissue using ballistic
methods, such as DNA particle bombardment. Alternatively, the DNA constructs can be
combined with suitable T-DNA flanking regions and introduced into a conventional
Agrobacterium tumefaciens host vector. The virulence functions of the Agrobacterium
host will direct the insertion of the construct and adjacent marker into the plant cell DNA
when the plant cell is infected by the bacteria.

Microinjection techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using polyethylene
glycol precipitation is described in Paszkowski et al. (1984) EMBO J 3:2717.
Electroporation techniques are described in Fromm et al. (1985) Proc Nat'l Acad Sci USA
82:5824. Ballistic transformation techniques are described in Klein et al. (1987) Nature
327:70; and Weeks et al. Plant Physiol 102:1077.

In some embodiments, Agrobacterium-mediated transformation techniques
are used to transfer the GAT sequences of the invention to transgenic plants.
Agrobacterium-mediated transformation is widely used for the transformation of dicots;
however, certain monocots can also be transformed by Agrobacterium. For example,
Agrobacterium transformation of rice is described by Hiei et al. (1994) Plant J. 6:271; US
Patent No. 5,187,073; US Patent No. 5,591,616; Li et al. (1991) Science in China 34:54;
and Raineri et al. (1990) Bio/Technology 8:33. Agrobacterium-mediated transformation of
maize, barley, triticale and asparagus has also been described (Xu et al.
(1990) Chinese J Bot 2:81).

Agrobacterium-mediated transformation techniques take advantage of the
ability of the tumor-inducing (Ti) plasmid of A. tumefaciens to integrate into a plant cell
genome, to co-transfer a nucleic acid of interest into a plant cell. Typically, an expression
vector is produced wherein the nucleic acid of interest, such as a GAT polynucleotide of
the invention, is ligated into an autonomously replicating plasmid which also contains
T-DNA sequences. T-DNA sequences typically flank the expression cassette nucleic acid
of interest and comprise the integration sequences of the plasmid. In addition to the
expression cassette, T-DNA also typically includes a marker sequence, e.g., antibiotic
resistance genes. The plasmid with the T-DNA and the expression cassette is then
transfected into Agrobacterium cells. Typically, for effective transformation of plant cells,
the A. tumefaciens bacterium also possesses the necessary vir regions on a plasmid, or
integrated into its chromosome. For a discussion of Agrobacterium-mediated
transformation, see Firoozabady and Kuehnle, (1995) Plant Cell Tissue and Organ
Culture Fundamental Methods, Gamborg and Phillips (eds.).

Regeneration of Transgenic Plants

Transformed plant cells which are derived by plant transformation
techniques, including those discussed above, can be cultured to regenerate a whole plant
which possesses the transformed genotype (i.e., a GAT polynucleotide), and thus the
desired phenotype, such as acquired resistance (i.e., tolerance) to glyphosate or a
glyphosate analog. Such regeneration techniques rely on manipulation of certain
phytohormones in a tissue culture growth medium, typically relying on a biocide and/or
herbicide marker which has been introduced together with the desired nucleotide
sequences. Alternatively, selection for glyphosate resistance conferred by the GAT
polynucleotide of the invention can be performed. Plant regeneration from cultured
protoplasts is described in Evans et al. (1983) Protoplasts Isolation and Culture, Handbook
of Plant Cell Culture, pp 124-176, Macmillan Publishing Company, New York; and
Binding (1985) Regeneration of Plants, Plant Protoplasts pp 21-73, CRC Press, Boca
Raton. Regeneration can also be obtained from plant callus, explants, organs, or parts
thereof. Such regeneration techniques are described generally in Klee et al. (1987) Ann
Rev of Plant Phys 38:467. See also, e.g., Payne and Gamborg. After transformation with
Agrobacterium, the explants typically are transferred to selection medium. One of skill
will realize that the selection medium depends on the selectable marker that was
co-transfected into the explants. After a suitable length of time, transformants will begin to
form shoots. After the shoots are about 1-2 cm in length, the shoots should be transferred
to a suitable root and shoot medium. Selection pressure should be maintained in the root
and shoot medium.

Typically, the transformants will develop roots in about 1-2 weeks and
form plantlets. After the plantlets are about 3-5 cm in height, they are placed in sterile soil
in fiber pots. Those of skill in the art will realize that different acclimation procedures are
used to obtain transformed plants of different species. For example, after developing a
root and shoot, cuttings, as well as somatic embryos of transformed plants, are transferred
to medium for establishment of plantlets. For a description of selection and regeneration
of transformed plants, see, e.g., Dodds and Roberts (1995) Experiments in Plant Tissue
Culture, 3rd Ed., Cambridge University Press.

There are also methods for Agrobacterium transformation of Arabidopsis
using vacuum infiltration (Bechtold N., Ellis J. and Pelletier G., 1993, In planta
Agrobacterium mediated gene transfer by infiltration of adult Arabidopsis thaliana plants.
CR Acad Sci Paris Life Sci 316:1194-1199) and simple dipping of flowering plants
(Desfeux, C., Clough S.J., and Bent A.F., 2000, Female reproductive tissues are the
primary target of Agrobacterium-mediated transformation by the Arabidopsis floral-dip
method. Plant Physiol. 123:895-904). Using these methods, transgenic seed are produced
without the need for tissue culture.

There are plant varieties for which effective Agrobacterium-mediated
transformation protocols have yet to be developed. For example, successful tissue
transformation coupled with regeneration of the transformed tissue to produce a transgenic
plant has not been reported for some of the most commercially relevant cotton cultivars.
Nevertheless, an approach that can be used with these plants involves stably introducing
the polynucleotide into a related plant variety via Agrobacterium-mediated transformation,
confirming operability, and then transferring the transgene to the desired commercial
strain using standard sexual crossing or back-crossing techniques. For example, in the
case of cotton, Agrobacterium can be used to transform a Coker line of Gossypium
hirsutum (e.g., Coker lines 310, 312, 5110, Deltapine 61 or Stoneville 213), and then the
transgene can be introduced into another more commercially relevant G. hirsutum cultivar
by back-crossing.
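
By way of illustration only, the effect of repeated back-crossing on genetic background can be estimated with the standard expectation that each back-cross halves the remaining donor contribution. The function name below is hypothetical, and the calculation assumes unlinked loci and no selection for background markers; it ignores donor DNA that remains linked to the transgene.

```python
def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected fraction of the genome derived from the recurrent (commercial) parent.

    backcrosses = 0 corresponds to the initial cross (F1); each further
    back-cross to the recurrent parent halves the expected donor fraction.
    """
    if backcrosses < 0:
        raise ValueError("number of back-crosses cannot be negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Tabulate the expectation for the initial cross and six back-crosses.
for n in range(7):
    print(f"BC{n}: {recurrent_parent_fraction(n):.4f}")
```

Under these assumptions the expected recurrent-parent fraction rises quickly, which is why a small number of back-crosses is usually enough to recover most of the commercial background.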

The transgenic plants of this invention can be characterized either
genotypically or phenotypically to determine the presence of the GAT polynucleotide of
the invention. Genotypic analysis can be performed by any of a number of well-known
techniques, including PCR amplification of genomic DNA and hybridization of genomic
DNA with specific labeled probes. Phenotypic analysis includes, e.g., survival of plants or
plant tissues exposed to a selected herbicide such as glyphosate.

Essentially any plant can be transformed with the GAT polynucleotides of
the invention. Suitable plants for the transformation and expression of the novel GAT
polynucleotides of this invention include agronomically and horticulturally important
species. Such species include, but are not restricted to, members of the families: Gramineae
(including corn, rye, triticale, barley, millet, rice, wheat, oats, etc.); Leguminosae
(including pea, beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean, clover,
alfalfa, lupine, vetch, lotus, sweet clover, wisteria, and sweetpea); Compositae (the largest
family of vascular plants, including at least 1,000 genera, including important commercial
crops such as sunflower) and Rosaceae (including raspberry, apricot, almond, peach, rose,
etc.), as well as nut plants (including walnut, pecan, hazelnut, etc.), and forest trees
(including Pinus, Quercus, Pseudotsuga, Sequoia, Populus, etc.).

Additional targets for modification by the GAT polynucleotides of the
invention, as well as those specified above, include plants from the genera: Agrostis,
Allium, Antirrhinum, Apium, Arachis, Asparagus, Atropa, Avena (e.g., oats), Bambusa,
Brassica, Bromus, Browallia, Camellia, Cannabis, Capsicum, Cicer, Chenopodium,
Cichorium, Citrus, Coffea, Coix, Cucumis, Cucurbita, Cynodon, Dactylis, Datura,
Daucus, Digitalis, Dioscorea, Elaeis, Eleusine, Festuca, Fragaria, Geranium, Gossypium,
Glycine, Helianthus, Hemerocallis, Hevea, Hordeum (e.g., barley), Hyoscyamus, Ipomoea,
Lactuca, Lens, Lilium, Linum, Lolium, Lotus, Lycopersicon, Majorana, Malus, Mangifera,
Manihot, Medicago, Nemesia, Nicotiana, Onobrychis, Oryza (e.g., rice), Panicum,
Pelargonium, Pennisetum (e.g., millet), Petunia, Pisum, Phaseolus, Phleum, Poa, Prunus,
Ranunculus, Raphanus, Ribes, Ricinus, Rubus, Saccharum, Salpiglossis, Secale (e.g., rye),
Senecio, Setaria, Sinapis, Solanum, Sorghum, Stenotaphrum, Theobroma, Trifolium,
Trigonella, Triticum (e.g., wheat), Vicia, Vigna, Vitis, Zea (e.g., corn), and the Olyreae,
the Pharoideae and many others. As noted, plants in the family Gramineae are
particular target plants for the methods of the invention.

Common crop plants which are targets of the present invention include
corn, rice, triticale, rye, cotton, soybean, sorghum, wheat, oats, barley, millet, sunflower,
canola, peas, beans, lentils, peanuts, yam beans, cowpeas, velvet beans, clover, alfalfa,
lupine, vetch, lotus, sweet clover, wisteria, sweetpea and nut plants (e.g., walnut, pecan,
etc.).

In one aspect, the invention provides a method for producing a crop by
growing a crop plant that is glyphosate-tolerant as a result of being transformed with a
gene encoding a glyphosate N-acetyltransferase, under conditions such that the crop plant
produces a crop, and harvesting the crop. Preferably, glyphosate is applied to the plant, or
in the vicinity of the plant, at a concentration effective to control weeds without preventing
the transgenic crop plant from growing and producing the crop. The application of
glyphosate can be before planting, or at any time after planting up to and including the
time of harvest. Glyphosate can be applied once or multiple times. The timing of
glyphosate application, amount applied, mode of application, and other parameters will
vary based upon the specific nature of the crop plant and the growing environment, and
can be readily determined by one of skill in the art. The invention further provides the
crop produced by this method.

The invention provides for the propagation of a plant containing a GAT 
polynucleotide transgene. The plant can be, for example, a monocot or a dicot. In one 
aspect, propagation entails crossing a plant containing a GAT polynucleotide transgene
with a second plant, such that at least some progeny of the cross display glyphosate
tolerance.

In one aspect, the invention provides a method for selectively controlling
weeds in a field where a crop is being grown. The method involves planting crop seeds or
plants that are glyphosate-tolerant as a result of being transformed with a gene encoding a
GAT, e.g., a GAT polynucleotide, and applying to the crop and any weeds a sufficient
amount of glyphosate to control the weeds without a significant adverse impact on the
crops. It is important to note that it is not necessary for the crop to be totally insensitive to
the herbicide, so long as the benefit derived from the inhibition of weeds outweighs any
negative impact of the glyphosate or glyphosate analog on the crop or crop plant.

In another aspect, the invention provides for use of a GAT polynucleotide
as a selectable marker gene. In this embodiment of the invention, the presence of the GAT
polynucleotide in a cell or organism confers upon the cell or organism the detectable
phenotypic trait of glyphosate resistance, thereby allowing one to select for cells or
organisms that have been transformed with a gene of interest linked to the GAT
polynucleotide. Thus, for example, the GAT polynucleotide can be introduced into a
nucleic acid construct, e.g., a vector, thereby allowing for the identification of a host (e.g.,
a cell or transgenic plant) containing the nucleic acid construct by growing the host in the
presence of glyphosate and selecting for the ability to survive and/or grow at a rate that is
discernibly greater than a host lacking the nucleic acid construct would survive or grow.
A GAT polynucleotide can be used as a selectable marker in a wide variety of hosts that
are sensitive to glyphosate, including plants, most bacteria (including E. coli),
actinomycetes, yeasts, algae and fungi. One benefit of using herbicide resistance as a
marker in plants, as opposed to conventional antibiotic resistance, is that it obviates the
concern of some members of the public that antibiotic resistance might escape into the
environment. Experimental data demonstrating the use of a GAT
polynucleotide as a selectable marker in diverse host systems are described in the
Examples section of this specification.

Selection of gat polynucleotides conferring enhanced glyphosate resistance
in transgenic plants.

Libraries of GAT encoding nucleic acids diversified according to the
methods described herein can be selected for the ability to confer resistance to glyphosate 
in transgenic plants. Following one or more cycles of diversification and selection, the 
modified GAT genes can be used as a selection marker to facilitate the production and 
evaluation of transgenic plants and as a means of conferring herbicide resistance in 
experimental or agricultural plants. For example, after diversification of any one or more 
of SEQ ID NO:1 to SEQ ID NO:5 to produce a library of diversified GAT
polynucleotides, an initial functional evaluation can be performed by expressing the 
library of GAT encoding sequences in E. coli. The expressed GAT polypeptides can be 
purified, or partially purified as described above, and screened for improved kinetics by 
mass spectrometry. Following one or more preliminary rounds of diversification and 
selection, the polynucleotides encoding improved GAT polypeptides are cloned into a 
plant expression vector, operably linked to, e.g., a strong constitutive promoter, such as the 
CaMV 35S promoter. The expression vectors comprising the modified GAT nucleic acids 
are transformed, typically by Agrobacterium mediated transformation, into Arabidopsis 
thaliana host plants. For example, Arabidopsis hosts are readily transformed by dipping 
inflorescences into solutions of Agrobacterium and allowing them to grow and set seed. 
Thousands of seeds are recovered in approximately 6 weeks. The seeds are then collected 
in bulk from the dipped plants and germinated in soil. In this manner it is possible to 
generate several thousand independently transformed plants for evaluation, constituting a 
high throughput (HTP) plant transformation format. Bulk grown seedlings are sprayed
with glyphosate; seedlings exhibiting glyphosate resistance survive the
selection process, whereas non-transgenic plants and plants incorporating less favorable
modified GAT nucleic acids are damaged or killed by the herbicide treatment. Optionally,
the GAT encoding nucleic acids conferring improved resistance to glyphosate are 
recovered, e.g., by PCR amplification using T-DNA primers flanking the library inserts, 
and used in further diversification procedures or to produce additional transgenic plants of 
the same or different species. If desired, additional rounds of diversification and selection 
can be performed using increasing concentrations of glyphosate in each subsequent
selection. In this manner, GAT polynucleotides and polypeptides conferring resistance to
concentrations of glyphosate useful in field conditions can be obtained.
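
The iterative logic of the procedure above (diversify, spray, recover survivors, repeat at a higher glyphosate concentration) can be pictured with a toy numerical model. Everything in the sketch below is hypothetical: each variant is reduced to a single tolerance score in arbitrary units, "diversification" is random noise added to a parent score, and a seedling "survives" when its score is at least the applied dose. It is meant only to show the shape of the loop, not to model real libraries or real spray rates.

```python
import random


def diversify(parents, rng, library_size, step=0.5):
    """Toy diversification: parents are kept and noisy copies are added."""
    library = list(parents)
    while len(library) < library_size:
        parent = rng.choice(parents)
        library.append(max(0.0, parent + rng.gauss(0.0, step)))
    return library


def select(library, dose):
    """Toy spray step: a variant survives if its score is at least the dose."""
    return [score for score in library if score >= dose]


def evolve(start, doses, library_size=1000, keep=10, seed=0):
    """Run one round of diversification and selection per dose, in order."""
    rng = random.Random(seed)
    parents = list(start)
    history = []
    for dose in doses:
        survivors = select(diversify(parents, rng, library_size), dose)
        if not survivors:
            break  # no variant tolerated this dose; keep the previous parents
        parents = sorted(survivors, reverse=True)[:keep]
        history.append((dose, len(survivors), parents[0]))
    return parents, history


parents, history = evolve(start=[1.0], doses=[1.0, 1.5, 2.0])
for dose, n_survivors, best in history:
    print(f"dose {dose}: {n_survivors} survivors, best score {best:.2f}")
```

Carrying the parents forward into each new library is a design choice of this sketch so that the best variant found so far is never lost between rounds.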

Herbicide Resistance

The mechanism of glyphosate resistance of the present invention can be
combined with other modes of glyphosate resistance known in the art to produce plants
and plant explants with superior glyphosate resistance. For example, glyphosate-tolerant
plants can be produced by inserting into the genome of the plant the capacity to produce a
higher level of 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) as more fully
described in U.S. Patent Nos. 6,248,876 B1; 5,627,061; 5,804,425; 5,633,435; 5,145,783;
4,971,908; 5,312,910; 5,188,642; 4,940,835; 5,866,775; 6,225,114 B1; 6,130,366;
5,310,667; 4,535,060; 4,769,061; 5,633,448; 5,510,471; Re. 36,449; Re. 37,287 E; and
5,491,288; and international publications WO 97/04103; WO 00/66746; WO 01/66704;
and WO 00/66747, which are incorporated herein by reference in their entireties for all
purposes. Glyphosate resistance is also imparted to plants that express a gene that encodes
a glyphosate oxido-reductase enzyme as described more fully in U.S. Patent Nos.
5,776,760 and 5,463,175, which are incorporated herein by reference in their entireties for
all purposes.

Further, the mechanism of glyphosate resistance of the present invention
may be combined with other modes of herbicide resistance to provide plants and plant
explants that are resistant to glyphosate and one or more other herbicides. For example,
the hydroxyphenylpyruvatedioxygenases are enzymes that catalyze the reaction in which
para-hydroxyphenylpyruvate (HPP) is transformed into homogentisate. Molecules which
inhibit this enzyme, and which bind to the enzyme in order to inhibit transformation of the
HPP into homogentisate, are useful as herbicides. Plants more resistant to certain
herbicides are described in U.S. Patent Nos. 6,245,968 B1; 6,268,549; and 6,069,115; and
international publication WO 99/23886, which are incorporated herein by reference in
their entireties for all purposes.

Sulfonylurea and imidazolinone herbicides also inhibit growth of higher
plants by blocking acetolactate synthase (ALS) or acetohydroxy acid synthase (AHAS).
The production of sulfonylurea and imidazolinone tolerant plants is described more fully
in U.S. Patent Nos. 5,605,011; 5,013,659; 5,141,870; 5,767,361; 5,731,180; 5,304,732;
4,761,373; 5,331,107; 5,928,937; and 5,378,824; and international publication WO
96/33270, which are incorporated herein by reference in their entireties for all purposes.

Glutamine synthetase (GS) appears to be an essential enzyme necessary for
the development and life of most plant cells. Inhibitors of GS are toxic to plant cells.
Glufosinate herbicides have been developed based on the toxic effect due to the inhibition
of GS in plants. These herbicides are non-selective. They inhibit growth of all the
different species of plants present, causing their total destruction. The development of
plants containing an exogenous phosphinothricin acetyl transferase is described in U.S.
Patent Nos. 5,969,213; 5,489,520; 5,550,318; 5,874,265; 5,919,675; 5,561,236; 5,648,477;
5,646,024; 6,177,616 B1; and 5,879,903, which are incorporated herein by reference in
their entireties for all purposes.

Protoporphyrinogen oxidase (protox) is necessary for the production of
chlorophyll, which is necessary for all plant survival. The protox enzyme serves as the
target for a variety of herbicidal compounds. These herbicides also inhibit growth of all
the different species of plants present, causing their total destruction. The development of
plants containing altered protox activity which are resistant to these herbicides is
described in U.S. Patent Nos. 6,288,306 B1; 6,282,837 B1; and 5,767,373; and
international publication WO 01/12825, which are incorporated herein by reference in
their entireties for all purposes.

EXAMPLES 

The following examples are illustrative and not limiting. One of skill will
recognize a variety of non-critical parameters that can be altered to achieve essentially
similar results.

EXAMPLE 1: ISOLATING NOVEL NATIVE GAT POLYNUCLEOTIDES 

Five native GAT polynucleotides (i.e., GAT polynucleotides that occur
naturally in a non-genetically modified organism) were discovered by expression cloning
of sequences from Bacillus strains exhibiting GAT activity. Their nucleotide sequences
were determined and are provided herein as SEQ ID NO:1 to SEQ ID NO:5. Briefly, a
collection of approximately 500 Bacillus and Pseudomonas strains was screened for
native ability to N-acetylate glyphosate. Strains were grown in LB overnight, harvested
by centrifugation, permeabilized in dilute toluene, and then washed and resuspended in a
reaction mix containing buffer, 5 mM glyphosate, and 200 µM acetyl-CoA. The cells
were incubated in the reaction mix for between 1 and 48 hours, at which time an equal
volume of methanol was added to the reaction. The cells were then pelleted by
centrifugation and the supernatant was filtered before analysis by parent ion mode mass
spectrometry. The product of the reaction was positively identified as N-acetylglyphosate
by comparing the mass spectrometry profile of the reaction mix to an N-acetylglyphosate
standard as shown in Figure 2. Product detection was dependent on inclusion of both
substrates (acetyl-CoA and glyphosate) and was abolished by heat denaturing the bacterial
cells.
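
The mass spectrometric comparison above relies on the mass shift that N-acetylation introduces. As a rough illustration, the sketch below computes monoisotopic masses from molecular formulas. The formulas used (C3H8NO5P for glyphosate and C5H10NO6P for N-acetylglyphosate) and the helper name are assumptions supplied for this example rather than values taken from this specification, and the masses are for the neutral molecules, not for any particular ion observed in an instrument.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element used here.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum element masses for a simple formula such as 'C3H8NO5P' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


glyphosate = monoisotopic_mass("C3H8NO5P")
n_acetylglyphosate = monoisotopic_mass("C5H10NO6P")
acetyl_shift = n_acetylglyphosate - glyphosate  # net addition of C2H2O

print(f"glyphosate          {glyphosate:.4f} Da")
print(f"N-acetylglyphosate  {n_acetylglyphosate:.4f} Da")
print(f"shift on acetylation {acetyl_shift:.4f} Da")
```

A shift of roughly 42 Da between substrate and product is the signature one would look for when comparing a reaction mix against an N-acetylglyphosate standard.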

Individual GAT polynucleotides were then cloned from the identified
strains by functional screening. Genomic DNA was prepared and partially digested with
Sau3AI enzyme. Fragments of approximately 4 Kb were cloned into an E. coli expression
vector and transformed into electrocompetent E. coli. Individual clones exhibiting GAT
activity were identified by mass spectrometry following a reaction as described previously
except that the toluene wash was replaced by permeabilization with PMBS. Genomic
fragments were sequenced and the putative GAT polypeptide-encoding open reading
frame identified. Identity of the GAT gene was confirmed by expression of the open
reading frame in E. coli and detection of high levels of N-acetylglyphosate produced from
reaction mixtures.

EXAMPLE 2: CHARACTERIZATION OF A GAT POLYPEPTIDE ISOLATED
FROM B. LICHENIFORMIS STRAIN B6.

Genomic DNA from B. licheniformis strain B6 was purified, partially
digested with Sau3AI and fragments of 1-10 Kb were cloned into an E. coli expression
vector. A clone with a 2.5 kb insert conferred the glyphosate N-acetyltransferase (GAT)
activity on the E. coli host as determined with mass spectrometry analysis. Sequencing of
the insert revealed a single complete open reading frame of 441 base pairs. Subsequent
cloning of this open reading frame confirmed that it encoded the GAT enzyme. A
plasmid, pMAXY2120, shown in Figure 4, with the gene encoding the GAT enzyme of
B6 was transformed into E. coli strain XL1 Blue. A 10% inoculum of a saturated culture
was added to Luria broth, and the culture was incubated at 37° C for 1 hr. Expression of
GAT was induced by the addition of IPTG at a concentration of 1 mM. The culture was
incubated a further 4 hrs, following which cells were harvested by centrifugation and the
cell pellet stored at -80° C.

Lysis of the cells was effected by the addition of 1 ml of the following
buffer to 0.2 g of cells: 25 mM HEPES, pH 7.3, 100 mM KCl and 10% methanol (HKM)
plus 0.1 mM EDTA, 1 mM DTT, 1 mg/ml chicken egg lysozyme, and a protease inhibitor
cocktail obtained from Sigma and used according to the manufacturer's recommendations.
After 20 minutes incubation at room temperature (e.g., 22-25° C), lysis was completed
with brief sonication. The lysate was centrifuged and the supernatant was desalted by
passage through Sephadex G25 equilibrated with HKM. Partial purification was obtained
by affinity chromatography on CoA Agarose (Sigma). The column was equilibrated with
HKM and the clarified extract allowed to pass through under hydrostatic pressure.
Non-binding proteins were removed by washing the column with HKM, and GAT was eluted
with HKM containing 1 mM Coenzyme A. This procedure provided 4-fold purification.
At this stage, approximately 65% of the protein staining observed on an SDS
polyacrylamide gel loaded with crude lysate was due to GAT, with another 20% due to
chloramphenicol acetyltransferase encoded by the vector.

Purification to homogeneity was obtained by gel filtration of the partially 
purified protein through Superdex 75 (Pharmacia). The mobile phase was HKM, in which 
GAT activity eluted at a volume corresponding to a molecular weight of 17 kD. This 
material was homogeneous as judged by Coomassie staining of a 3 µg sample of GAT 
subjected to SDS polyacrylamide gel electrophoresis on a 12% acrylamide gel, 1 mm 
thickness. Purification was achieved with a 6-fold increase in specific activity. 

The apparent KM for glyphosate was determined on reaction mixtures containing 
saturating (200 µM) acetyl CoA, varying concentrations of glyphosate, and 1 µM purified 
GAT in buffer containing 5 mM morpholine adjusted to pH 7.7 with acetic acid and 20% 
ethylene glycol. Initial reaction rates were determined by continuous monitoring of the 
hydrolysis of the thioester bond of acetyl CoA at 235 nm (ε = 3.4 OD/mM/cm). 
Hyperbolic saturation kinetics were observed (Figure 5), from which an apparent KM of 
2.9 ± 0.2 (SD) mM was obtained. 
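
The rate calculation behind the continuous assay can be sketched as follows. This is a minimal illustration, not the procedure used to generate Figure 5: the function names, the 1 cm path length and the example absorbance slope are assumptions, and only the extinction coefficient (3.4 OD/mM/cm at 235 nm) and the apparent KM of 2.9 mM come from the text.

```python
# Sketch: convert an absorbance slope at 235 nm into a reaction rate and
# evaluate the hyperbolic (Michaelis-Menten) rate law.
# EPSILON_235 is taken from the text; the path length, the function names and
# the example inputs are illustrative assumptions.

EPSILON_235 = 3.4  # OD per mM per cm, thioester bond of acetyl CoA


def rate_mM_per_min(delta_od_per_min, path_cm=1.0):
    """Convert an absorbance change (OD/min) into a rate in mM/min."""
    return delta_od_per_min / (EPSILON_235 * path_cm)


def michaelis_menten(substrate_mM, vmax, km_mM):
    """Hyperbolic saturation: v = Vmax * [S] / (KM + [S])."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Hypothetical absorbance slope of 0.034 OD/min in a 1 cm cuvette.
example_rate = rate_mM_per_min(0.034)

# At [glyphosate] equal to the apparent KM (2.9 mM) the rate is half of Vmax.
fraction_of_vmax = michaelis_menten(2.9, 1.0, 2.9)
```

With a different optical path (for example a plate well), only `path_cm` would need to change.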

The apparent KM for AcCoA was determined on reaction mixtures 
containing 5 mM glyphosate, varying concentrations of acetyl CoA, and 0.19 µM GAT in 
buffer containing 5 mM morpholine adjusted to pH 7.7 with acetic acid and 50% 
methanol. Initial reaction rates were determined using mass spectrometric detection of 
N-acetylglyphosate. Five µl were repeatedly injected into the instrument, and reaction rates 
were obtained by plotting reaction time vs. area of the integrated peak (Figure 6). 
Hyperbolic saturation kinetics were observed (Figure 7), from which an apparent KM of 
2 µM was derived. From values for Vmax obtained at a known concentration of enzyme, a 
kcat of 6/min was calculated. 
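
The relation used in the last step is kcat = Vmax / [E]. A minimal sketch follows; the function name is hypothetical, and the Vmax value is back-calculated from the reported kcat and enzyme concentration purely for illustration, since no measured Vmax is given in the text.

```python
# Sketch: turnover number from Vmax and total enzyme concentration.
# kcat = Vmax / [E]; both quantities must use the same concentration unit.


def kcat_per_min(vmax_uM_per_min, enzyme_uM):
    """Return kcat in min^-1 from Vmax (uM/min) and enzyme concentration (uM)."""
    if enzyme_uM <= 0:
        raise ValueError("enzyme concentration must be positive")
    return vmax_uM_per_min / enzyme_uM


# 0.19 uM GAT is the concentration stated in the text. The Vmax below is an
# illustrative value chosen to be consistent with the reported kcat of 6/min.
vmax_example = 1.14  # uM/min (assumed, not a measured value)
kcat_example = kcat_per_min(vmax_example, 0.19)
```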

EXAMPLE 3: MASS SPECTROMETRY (MS) SCREENING PROCESS 

Sample (5 µl) is drawn from a 96-well microtiter plate at a rate of one 
sample every 26 seconds and injected into the mass spectrometer (Micromass Quattro LC, 
triple quadrupole mass spectrometer) without any separation. The sample is carried into 
the mass spectrometer by a mobile phase of water/methanol (50:50) at a flow rate of 500 
µl/min. Each injected sample is ionized by a negative electrospray ionization process 
(needle voltage, -3.5 kV; cone voltage, 20 V; source temperature, 120°C; desolvation 
temperature, 250°C; cone gas flow, 90 L/hr; and desolvation gas flow, 600 L/hr). The 
molecular ions (m/z 210) formed during this process are selected by the first quadrupole 
for collision-induced dissociation (CID) in the second quadrupole, where the 
pressure is set at 5 x 10⁻⁴ mBar and the collision energy is adjusted to 20 eV. The third 
quadrupole is set to allow only one of the daughter ions (m/z 124) produced from the 
parent ions (m/z 210) to reach the detector for signal recording. The first and third 
quadrupoles are set at unit resolution, while the photomultiplier is operated at 650 V. Pure 
N-acetylglyphosate standards are used for comparison, and peak integration is used to 
estimate concentrations. It is possible to detect less than 200 nM N-acetylglyphosate by 
this method. 
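
Estimating a concentration from an integrated peak by comparison with standards can be sketched as below. This is an illustration under stated assumptions rather than the calibration actually used: it assumes a linear detector response passing through the origin, and the standard concentrations, peak areas and function names are all hypothetical.

```python
# Sketch: external-standard calibration for integrated MS peak areas.
# Assumes area = factor * concentration (linear, through the origin).
# All numbers below are hypothetical illustration values.


def fit_response_factor(std_conc_uM, std_areas):
    """Least-squares slope through the origin for area versus concentration."""
    numerator = sum(c * a for c, a in zip(std_conc_uM, std_areas))
    denominator = sum(c * c for c in std_conc_uM)
    return numerator / denominator


def estimate_conc_uM(peak_area, factor):
    """Convert an integrated peak area into a concentration (uM)."""
    return peak_area / factor


standards_uM = [0.5, 1.0, 2.0, 5.0]                  # hypothetical standards
standard_areas = [1000.0, 2000.0, 4000.0, 10000.0]   # hypothetical peak areas

factor = fit_response_factor(standards_uM, standard_areas)
unknown_uM = estimate_conc_uM(3000.0, factor)
```

If the response were not linear over the working range, a standard curve with more points or interpolation between bracketing standards would be the more appropriate choice.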

EXAMPLE 4: DETECTION OF NATIVE OR LOW ACTIVITY GAT ENZYMES 

Native or low activity GAT enzymes typically have kcat of approximately 
1 min⁻¹ and KM for glyphosate of 1.5-10 mM. KM for acetyl CoA is typically less than 
25 µM. 

Bacterial cultures are grown in rich medium in deep 96-well plates and 0.5 
ml stationary phase cells are harvested by centrifugation, washed with 5 mM morpholine 
acetate pH 8, and resuspended in 0.1 ml reaction mix containing 200 µM ammonium 
acetyl CoA, 5 mM ammonium glyphosate, and 5 µg/ml PMBS (Sigma) in 5 mM 
morpholine acetate, pH 8. The PMBS permeabilizes the cell membrane, allowing the 
substrates and products to move from the cells to the buffer without releasing the entire 
cellular contents. Reactions are carried out at 25-37°C for 1-48 hours. The reactions are 
quenched with an equal volume of 100% ethanol and the entire mixture is filtered on a 
0.45 µm MAHV Multiscreen filter plate (Millipore). Samples are analyzed using a mass 
spectrometer as described above and compared to synthetic N-acetylglyphosate standards. 

EXAMPLE 5: DETECTION OF HIGH ACTIVITY GAT ENZYMES 

High activity GAT enzymes typically have kcat up to 400 min⁻¹ and KM 
below 0.1 mM glyphosate. 

Genes coding for GAT enzymes are cloned into E. coli expression vectors 
such as pQE80 (Qiagen) and introduced into E. coli strains such as XL1 Blue (Stratagene). 
Cultures are grown in 150 µl rich medium (such as LB with 50 µg/ml carbenicillin) in 
shallow U-bottom 96-well polystyrene plates to late-log phase and diluted 1:9 with fresh 
medium containing 1 mM IPTG (USB). After 4-8 hours of induction, cells are harvested, 
washed with 5 mM morpholine acetate pH 6.8 and resuspended in an equal volume of the 
same morpholine buffer. Reactions are carried out with up to 10 µl of washed cells. At 
higher activity levels, the cells are first diluted up to 1:200 and 5 µl is added to 100 µl 
reaction mix. To measure GAT activity, the same reaction mix as described for low 
activity can be used. However, for detecting highly active GAT enzymes the glyphosate 
concentration is reduced to 0.15-0.5 mM, the pH is reduced to 6.8, and reactions are 
carried out for 1 hour at 37°C. Reaction workup and MS detection are as described herein. 

EXAMPLE 6: PURIFICATION OF GAT ENZYMES 

Enzyme purification is achieved by affinity chromatography of cell lysates 
on CoA-agarose and gel filtration on Superdex 75. Quantities of purified GAT enzyme up 
to 10 mg are obtained as follows: A 100-ml culture of E. coli carrying a GAT 
polynucleotide on a pQE80 vector and grown overnight in LB containing 50 µg/ml 
carbenicillin is used to inoculate 1 L of LB plus 50 µg/ml carbenicillin. After 1 hr, IPTG 
is added to 1 mM, and the culture is grown a further 6 hr. Cells are harvested by 
centrifugation. Lysis is effected by suspending the cells in 25 mM HEPES (pH 7.2), 100 
mM KCl, 10% methanol (termed HKM), 0.1 mM EDTA, 1 mM DTT, protease inhibitor 
cocktail supplied by Sigma-Aldrich and 1 mg/ml of chicken egg lysozyme. After 30 
minutes at room temperature, the cells are briefly sonicated. Particulate material is 
removed by centrifugation, and the lysate is passed through a bed of coenzyme A-agarose. 
The column is washed with several bed volumes of HKM and GAT is eluted in 
1.5 bed volumes of HKM containing 1 mM acetyl-coenzyme A. GAT in the eluate is 
concentrated by its retention above a Centricon YM 50 ultrafiltration membrane. Further 
purification is obtained by passing the protein through a Superdex 75 column in a 
series of 0.6-ml injections. The peak of GAT activity elutes at a volume corresponding to 
a molecular weight of 17 kD. This method results in purification of GAT enzyme to 
homogeneity with >85% recovery. A similar procedure is used to obtain 0.1 to 0.4 mg 
quantities of up to 96 shuffled variants at a time. The volume of induced culture is 
reduced to 1 to 10 ml, coenzyme A-agarose affinity chromatography is performed in 0.15-ml 
columns packed in an MAHV filter plate (Millipore) and Superdex 75 chromatography 
is omitted. 

EXAMPLE 7: STANDARD PROTOCOL FOR DETERMINATION OF kcat AND KM 

kcat and KM for glyphosate of purified protein are determined using a 
continuous spectrophotometric assay, in which hydrolysis of the thioester bond of 
AcCoA is monitored at 235 nm. Reactions are performed at ambient temperature (about 
23°C) in the wells of a 96-well assay plate, with the following components present in a 
final volume of 0.3 ml: 20 mM HEPES, pH 6.8, 10% ethylene glycol, 0.2 mM acetyl 
coenzyme A, and various concentrations of ammonium glyphosate. In comparing the 
kinetics of two GAT enzymes, both enzymes should be assayed under the same conditions, 
e.g., both at 23°C. kcat is calculated from Vmax and the enzyme concentration, determined 
by Bradford assay. KM is calculated from the initial reaction rates obtained from 
concentrations of glyphosate ranging from 0.125 to 10 mM, using the Lineweaver-Burk 
transformation of the Michaelis-Menten equation. kcat/KM is determined by dividing the 
value determined for kcat by the value determined for KM. 
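
The Lineweaver-Burk calculation described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the tables: the function name, the intermediate glyphosate concentrations, the enzyme concentration and the noise-free synthetic rates are assumptions. Only the 0.125 to 10 mM concentration range and the relations kcat = Vmax/[E] and kcat/KM come from the text.

```python
# Sketch: Lineweaver-Burk estimate of KM and Vmax, then kcat and kcat/KM.
# The double-reciprocal form is 1/v = (KM/Vmax) * (1/[S]) + 1/Vmax, so a
# straight-line fit of 1/v against 1/[S] gives slope = KM/Vmax and
# intercept = 1/Vmax.


def lineweaver_burk(substrate_mM, rates):
    """Return (KM, Vmax) from an ordinary least-squares fit of 1/v vs 1/[S]."""
    x = [1.0 / s for s in substrate_mM]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # KM / Vmax
    intercept = mean_y - slope * mean_x  # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free data (assumed values, for illustration only).
glyphosate_mM = [0.125, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
true_km, true_vmax = 0.5, 0.0322  # mM and mM/min
rates = [true_vmax * s / (true_km + s) for s in glyphosate_mM]

km_fit, vmax_fit = lineweaver_burk(glyphosate_mM, rates)

enzyme_mM = 0.0001                 # hypothetical enzyme concentration (0.1 uM)
kcat_fit = vmax_fit / enzyme_mM    # min^-1
efficiency = kcat_fit / km_fit     # mM^-1 min^-1
```

Because the double-reciprocal plot weights the lowest substrate concentrations most heavily, real (noisy) data may give somewhat different values than a direct nonlinear fit of the Michaelis-Menten equation.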

Using this methodology, kinetic parameters for a number of GAT 
polypeptides exemplified herein have been determined. For example, the kcat, KM and 
kcat/KM for the GAT polypeptide corresponding to SEQ ID NO:445 have been determined 
to be 322 min⁻¹, 0.5 mM and 660 mM⁻¹ min⁻¹, respectively, using the assay conditions 
described above. The kcat, KM and kcat/KM for the GAT polypeptide corresponding to 
SEQ ID NO:457 have been determined to be 118 min⁻¹, 0.1 mM and 1184 mM⁻¹ min⁻¹, 
respectively, using the assay conditions described above. The kcat, KM and kcat/KM for 
the GAT polypeptide corresponding to SEQ ID NO:300 have been determined to be 296 
min⁻¹, 0.65 mM and 456 mM⁻¹ min⁻¹, respectively, using the assay conditions described 
above. One of skill in the art can use these numbers to confirm that a GAT activity assay 
is generating kinetic parameters for a GAT suitable for comparison with the values given 
herein. For example, the conditions used to compare the activity of GATs should yield the 
same kinetic constants for SEQ ID NOS: 300, 445 and 457 (within normal experimental 
variance) as those reported herein, if the conditions are going to be used to compare a test 
GAT with the GAT polypeptides exemplified herein. Kinetic parameters for a number of 
GAT polypeptide variants were determined according to this methodology and are 
provided in Tables 3, 4 and 5. 
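
The reference-enzyme check described above can be sketched as a simple tolerance comparison. This is only an illustration: the kcat and KM reference values are those stated in the text, while the 20% relative tolerance, the function names and the example measurements are assumptions, since the text does not quantify "normal experimental variance".

```python
# Sketch: confirm that an assay reproduces the reported constants for the
# three reference GAT polypeptides before using it to compare a test GAT.

REFERENCE = {
    # name: (kcat in min^-1, KM in mM), as reported in the text
    "SEQ ID NO:445": (322.0, 0.5),
    "SEQ ID NO:457": (118.0, 0.1),
    "SEQ ID NO:300": (296.0, 0.65),
}


def within_tolerance(measured, reference, rel_tol):
    """True if measured lies within rel_tol (fractional) of reference."""
    return abs(measured - reference) <= rel_tol * reference


def assay_matches_reference(measurements, rel_tol=0.2):
    """Check every reference enzyme; rel_tol is an assumed tolerance."""
    for name, (ref_kcat, ref_km) in REFERENCE.items():
        kcat, km = measurements[name]
        if not (within_tolerance(kcat, ref_kcat, rel_tol)
                and within_tolerance(km, ref_km, rel_tol)):
            return False
    return True


# Hypothetical measurements from a laboratory repeating the assay.
example = {
    "SEQ ID NO:445": (310.0, 0.55),
    "SEQ ID NO:457": (125.0, 0.11),
    "SEQ ID NO:300": (280.0, 0.60),
}
assay_ok = assay_matches_reference(example)
```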

Table 3. GAT polypeptide kcat values 

SEQ ID NO.       Clone ID    kcat (min⁻¹)
SEQ ID NO:263    13_10F6     48.6
SEQ ID NO:264    13_12G6     52.1
SEQ ID NO:265    14_2A5      280.8
SEQ ID NO:266    14_2C1      133.4
SEQ ID NO:267    14_2F11     136.9
SEQ ID NO:268    CHIMERA     155.4
SEQ ID NO:269    10_12D7     77.3
SEQ ID NO:270    10_15F4     37.6
SEQ ID NO:271    10_17D1     176.2
SEQ ID NO:272    10_17F6     47.9
SEQ ID NO:273    10_18G9     24
SEQ ID NO:274    10_1H3      76.2
SEQ ID NO:275    10_20D10    86.2
SEQ ID NO:276    10_23F2     101.3
SEQ ID NO:277    10_2B8      108.4
SEQ ID NO:278    10_2C7      135
SEQ ID NO:279    10_3G5      87.4
SEQ ID NO:280    10_4H7      112
SEQ ID NO:281    10_6D11     62.4
SEQ ID NO:282    10_8C6      21.7
SEQ ID NO:283    11C3        2.8
SEQ ID NO:284    11G3        15.6
SEQ ID NO:285    11H3        1.2
SEQ ID NO:286    12_1F9      80.4
SEQ ID NO:287    12_2G9      151.4
SEQ ID NO:288    12_3F1      44.1
SEQ ID NO:289    12_5C10     89.6
SEQ ID NO:290    12_6A10     54.7
SEQ ID NO:291    12_6D1      49
SEQ ID NO:292    12_6F9      89.1
SEQ ID NO:293    12_6H6      90.5
SEQ ID NO:294    12_7D6      53.9
SEQ ID NO:295    12_7G11     234.5
SEQ ID NO:296    12F5        3.1
SEQ ID NO:297    12G7        2.3
SEQ ID NO:298    12H6        9.3
SEQ ID NO:299    13_12G12    36.1
SEQ ID NO:300    13_6D10     296.5
SEQ ID NO:301    13_7A7      117
SEQ ID NO:302    13_7B12     68.9
SEQ ID NO:303    13_7C1      48.1
SEQ ID NO:304    13_8G6      33.7
SEQ ID NO:305    13_9F6      59
SEQ ID NO:306    14_10C9     127
SEQ ID NO:307    14_10H3     105.2
SEQ ID NO:308    14_10H9     127.2
SEQ ID NO:309    14_11C2     108.7
SEQ ID NO:310    14_12D8     62.1
SEQ ID NO:311    14_12H6     91.1
SEQ ID NO:312    14_2B6      34.2
SEQ ID NO:313    14_2G11     69.4
SEQ ID NO:314    14_3B2      68.7
SEQ ID NO:315    14_4H8      198.8
SEQ ID NO:316    14_6A8      43.7
SEQ ID NO:317    14_6B10     134.7
SEQ ID NO:318    14_6D4      256
SEQ ID NO:319    14_7A11     197.2
SEQ ID NO:320    14_7A1      155.8
SEQ ID NO:321    14_7A9      245.9
SEQ ID NO:322    14_7G1      136.7
SEQ ID NO:323    14_7H9      64.4
SEQ ID NO:324    14_8F7      90.5
SEQ ID NO:325    15_10C2     69.9
SEQ ID NO:326    15_10D6     67.1
SEQ ID NO:327    15_11F9     76.4
SEQ ID NO:328    15_11H3     61.9
SEQ ID NO:329    15_12A8     77.1
SEQ ID NO:330    15_12D6     148.6
SEQ ID NO:331    15_12D8     59.7
SEQ ID NO:332    15_12D9     59.7
SEQ ID NO:333    15_3F10     48.7
SEQ ID NO:334    15_3G11     71.5
SEQ ID NO:335    15_4F11     80.3
SEQ ID NO:336    15_4H3      93.3
SEQ ID NO:337    15_6D3      85.9
SEQ ID NO:338    15_6G11     36.9
SEQ ID NO:339    15_9F6      59.6
SEQ ID NO:340    15F5        0.5
SEQ ID NO:341    16A1        10.4
SEQ ID NO:342    16H3        3.5
SEQ ID NO:343    17C12       3.2
SEQ ID NO:344    18D6        9.6
SEQ ID NO:345    19C6        2.2
SEQ ID NO:346    19D5        2.2
SEQ ID NO:347    20A12       2.8
SEQ ID NO:348    20F2        3.9
SEQ ID NO:349    2.10E+12    1.1
SEQ ID NO:350    23H11       7.1
SEQ ID NO:351    24C1        1.7
SEQ ID NO:352    24C6        2.7
SEQ ID NO:353    2.40E+08    8.9
SEQ ID NO:354    2_8C3       24.8
SEQ ID NO:355    2H3         16.1
SEQ ID NO:356    30G8        10.2
SEQ ID NO:357    3B_10C4     OA ft
SEQ ID NO:358    3B_10G7     19.6
SEQ ID NO:359    3B_12B1     22.8
SEQ ID NO:360    3B_12D10    5.4
SEQ ID NO:361    3B_2E5      16.4
SEQ ID NO:362    3C_10H3     33.9
SEQ ID NO:363    3C_12H10    9.1
SEQ ID NO:364    3C_9H8      11.7
SEQ ID NO:365    4A_1B11     23.2
SEQ ID NO:366    4A_1C2      20.4
SEQ ID NO:367    4B_13E1     37.2
SEQ ID NO:368    4B_13G10    $4.9
SEQ ID NO:369    4B_16E1     17
SEQ ID NO:370    4B_17A1     19.1
SEQ ID NO:371    4B_18F11    14.6
SEQ ID NO:372    4B_19C8     15.9
SEQ ID NO:373    4B_1G4      3.7
SEQ ID NO:374    4B_21C6     11.8
SEQ ID NO:375    4B_2H7      27
SEQ ID NO:376    4B_2H8      38.3
SEQ ID NO:377    4B_6D8      22.7
SEQ ID NO:378    4B_7E8      20.5
SEQ ID NO:379    4C_8C9      9
SEQ ID NO:380    4H1         1.3
SEQ ID NO:381    6_14D10     42.2
SEQ ID NO:382    6_15G7      48.4
SEQ ID NO:383    6_16A5      43.8
SEQ ID NO:384    6_16F5      35.2
SEQ ID NO:385    6_17C5      35.2
SEQ ID NO:386    6_18C7      32.2
SEQ ID NO:387    6_18D7      43
SEQ ID NO:388    6_19A10     86.8
SEQ ID NO:389    6_19B6      23.9
SEQ ID NO:390    6_19C3      23.1
SEQ ID NO:391    6_19C8      74.8
SEQ ID NO:392    6_20A7      40.4
SEQ ID NO:393    6_20A9      45.1
SEQ ID NO:394    6_20H5      19.5
SEQ ID NO:395    6_21F4      24.3
SEQ ID NO:396    6_22C9      47.4
SEQ ID NO:397    6_22D9      43.9
SEQ ID NO:398    6_22H9      17.4
SEQ ID NO:399    6_23H3      43.9
SEQ ID NO:400    6_23H7      46.2
SEQ ID NO:401    6_2H1       26.6
SEQ ID NO:402    6_3D6       41.7
SEQ ID NO:403    6_3G3       51.9
SEQ ID NO:404    6_3H2       57.2
SEQ ID NO:405    6_4A10      55
SEQ ID NO:406    6_4B1       27
SEQ ID NO:407    6_5D11      15.2
SEQ ID NO:408    6_5F11      40.1
SEQ ID NO:409    6_5G9       35.8
SEQ ID NO:410    6_6D5       55.3
SEQ ID NO:411    6_7D1       19.7
SEQ ID NO:412    6_8H3       44.7
SEQ ID NO:413    6_9G11      78.4
SEQ ID NO:414    6F1         10.1
SEQ ID NO:415    7_1C4       17.4
SEQ ID NO:416    7_2A10      14.5
SEQ ID NO:417    7_2A11      46.8
SEQ ID NO:418    7_2D7       54.9
SEQ ID NO:419    7_5C7       44.7
SEQ ID NO:420    7_9C9       65
SEQ ID NO:421    9_13F10     34.7
SEQ ID NO:422    9_13F1      31.6
SEQ ID NO:423    9_15D5      27.6
SEQ ID NO:424    9_15D8      107.3
SEQ ID NO:425    9_15H3      38.7
SEQ ID NO:426    9_18H2      25
SEQ ID NO:427    9_20F12     37.8
SEQ ID NO:428    9_21C8      28.6
SEQ ID NO:429    9_22B1      50.1
SEQ ID NO:430    9_23A10     21
SEQ ID NO:431    9_24F6      52.5
SEQ ID NO:432    9_4H10      101.3
SEQ ID NO:433    9_4H8       47.1
SEQ ID NO:434    9_8H1       74.8
SEQ ID NO:435    9_9H7       28
SEQ ID NO:436    9C6         13
SEQ ID NO:437    9H11        4
SEQ ID NO:438    0_4B10      190
SEQ ID NO:439    0_5B11      219
SEQ ID NO:440    0_5B3       143
SEQ ID NO:441    0_5B4       180
SEQ ID NO:442    0_5B8       143
SEQ ID NO:443    0_5C4       205
SEQ ID NO:444    0_5D11      224
SEQ ID NO:445    0_5D3       322
SEQ ID NO:446    0_5D7       244
SEQ ID NO:447    0_6B4       252
SEQ ID NO:448    0_6D10      111
SEQ ID NO:449    0_6D11      212
SEQ ID NO:450    0_6F2       175
SEQ ID NO:451    0_6H9       228
SEQ ID NO:452    10_4C10     69.6
SEQ ID NO:453    10_4D5      82.72
SEQ ID NO:454    10_4F2      231.04
SEQ ID NO:455    10_4F9      55.39
SEQ ID NO:456    10_4G5      176.65
SEQ ID NO:457    10_4H4      118.36
SEQ ID NO:458    11_3A11     55.66
SEQ ID NO:459    11_3B1      219.97
SEQ ID NO:460    11_3B5      194.61
SEQ ID NO:461    11_3C12     49.07
SEQ ID NO:462    11_3C3      214.02
SEQ ID NO:463    11_3C6      184.44
SEQ ID NO:464    11_3D6      55.3
SEQ ID NO:465    1_1G12      58.48
SEQ ID NO:466    1_1H1       291
SEQ ID NO:467    1_1H2       164
SEQ ID NO:468    1_1H5       94
SEQ ID NO:469    1_2A12      229
SEQ ID NO:470    1_2B6       138
SEQ ID NO:471    1_2C4       193
SEQ ID NO:472    1_2D2       124
SEQ ID NO:473    1_2D4
SEQ ID NO:474    1_2F8       161
SEQ ID NO:475    1_2H8       141
SEQ ID NO:476    1_3A2       181
SEQ ID NO:477    1_3D6       226
SEQ ID NO:478    1_3F3       167
SEQ ID NO:479    1_3H2       128
SEQ ID NO:480    1_4C5       254
SEQ ID NO:481    1_4D6       137
SEQ ID NO:482    1_4H1       236
SEQ ID NO:483    1_5H5       214
SEQ ID NO:484    1_6F12      209
SEQ ID NO:485    1_6H6       274
SEQ ID NO:486    3_11A10     135.41
SEQ ID NO:487    3_14F6      188.43
SEQ ID NO:488    3_15B2      104.13
SEQ ID NO:489    3_6A10      126.48
SEQ ID NO:490    3_6B1       263.08
SEQ ID NO:491    3_7F9       193.55
SEQ ID NO:492    3_8G11      99.14
SEQ ID NO:493    4_1B10      77.09
SEQ ID NO:494    5_2B3       56.75
SEQ ID NO:495    5_2D9       75.44
SEQ ID NO:496    5_2F10      54.72
SEQ ID NO:497    6_1A11      45.54
SEQ ID NO:498    6_1D5       42.92
SEQ ID NO:499    6_1F11      105.76
SEQ ID NO:500    6_1F1       69.81
SEQ ID NO:501    6_1H10      17.01
SEQ ID NO:502    6_1H4       85.91
SEQ ID NO:503    8_1F8       82.88
SEQ ID NO:504    8_1G2       67.47
SEQ ID NO:505    8_1G3       108.9
SEQ ID NO:506    8_1H7       101.24
SEQ ID NO:507    8_1H9       78.39
SEQ ID NO:508    GAT1_21F12  5.4
SEQ ID NO:509    GAT1_24G3   4.9
SEQ ID NO:510    GAT1_29G1   6.2
SEQ ID NO:511    GAT1_32G1   4.R
SEQ ID NO:512    GAT2_15G8   4.5
SEQ ID NO:513    GAT2_19H8   4.1
SEQ ID NO:514    GAT2_21F1




Table 4. GAT polypeptide (glyphosate) KM values 

SEQ ID NO.       Clone ID    KM (mM)
SEQ ID NO:263    13_10F6     1.3
SEQ ID NO:264    13_12G6     1.2
SEQ ID NO:265    14_2A5      1.6
SEQ ID NO:266    14_2C1      3.1
SEQ ID NO:267    14_2F11     1.7
SEQ ID NO:268    CHIMERA     1.3
SEQ ID NO:269    10_12D7     1.8
SEQ ID NO:270    10_15F4     1
SEQ ID NO:271    10_17D1     99
SEQ ID NO:272    10_17F6     1.4
SEQ ID NO:273    10_18G9     1.2
SEQ ID NO:274    10_1H3      1.9
SEQ ID NO:275    10_20D10    1.6
SEQ ID NO:276    10_23F2     0.9
SEQ ID NO:277    10_2B8      1.1
SEQ ID NO:278    10_2C7      1.4
SEQ ID NO:279    10_3G5      2
SEQ ID NO:280    10_4H7      1.7
SEQ ID NO:281    10_6D11     1.2
SEQ ID NO:282    10_8C6      0.7
SEQ ID NO:283    11C3        3.1
SEQ ID NO:284    11G3        1.7
SEQ ID NO:285    11H3        1.4
SEQ ID NO:286    12_1F9
SEQ ID NO:287    12_2G9      1 -O
SEQ ID NO:288    12_3F1      j.y
SEQ ID NO:289    12_5C10     1.5
SEQ ID NO:290    12_6A10     1.1
SEQ ID NO:291    12_6D1      A O
SEQ ID NO:292    12_6F9      1.9
SEQ ID NO:293    12_6H6      •i c 1 .b
SEQ ID NO:294    12_7D6      1.4
SEQ ID NO:295    12_7G11     o c.
SEQ ID NO:296    12F5        1.8
SEQ ID NO:297    12G7        Q "7 6.1
SEQ ID NO:298    12H6        0.9
SEQ ID NO:299    13_12G12    0.69
SEQ ID NO:300    13_6D10     0.65
SEQ ID NO:301    13_7A7      O.O
SEQ ID NO:302    13_7B12     A\ "7
SEQ ID NO:303    13_7C1      a r~ i .5
SEQ ID NO:304    13_8G6      0.61
SEQ ID NO:305    13_9F6      ■i o
SEQ ID NO:306    14_10C9     0.9
SEQ ID NO:307    14_10H3     O.D
SEQ ID NO:308    14_10H9     1.1
SEQ ID NO:309    14_11C2     1
SEQ ID NO:310    14_12D8     1
SEQ ID NO:311    14_12H6     0.9
SEQ ID NO:312    14_2B6      O.OO
SEQ ID NO:313    14_2G11     \ .4
SEQ ID NO:314    14_3B2      u.oo
SEQ ID NO:315    14_4H8      o
SEQ ID NO:316    14_6A8      U./o
SEQ ID NO:317    14_6B10     1.4
SEQ ID NO:318    14_6D4      1
SEQ ID NO:319    14_7A11     O "7 O./
SEQ ID NO:320    14_7A1      1 -b
SEQ ID NO:321    14_7A9
SEQ ID NO:322    14_7G1      U.bb
SEQ ID NO:323    14_7H9      1 .o
SEQ ID NO:324    14_8F7      1 .0
SEQ ID NO:325    15_10C2     U.O
SEQ ID NO:326    15_10D6     1
SEQ ID NO:327    15_11F9
SEQ ID NO:328    15_11H3     1
SEQ ID NO:329    15_12A8     l .b
SEQ ID NO:330    15_12D6     n "7/1 U./4
SEQ ID NO:331    15_12D8
SEQ ID NO:332    15_12D9     1.4
SEQ ID NO:333    15_3F10     0.9
SEQ ID NO:334    15_3G11     1.2
SEQ ID NO:335    15_4F11     0.9
SEQ ID NO:336    15_4H3      1
SEQ ID NO:337    15_6D3      1.4
SEQ ID NO:338    15_6G11     0.9
SEQ ID NO:339    15_9F6      1.1
SEQ ID NO:340    15F5        2.9
SEQ ID NO:341    16A1        2.9
SEQ ID NO:342    16H3        2.9
SEQ ID NO:343    17C12       1.4
SEQ ID NO:344    18D6        1.2
SEQ ID NO:345    19C6        1.1
SEQ ID NO:346    19D5        1.7
SEQ ID NO:347    20A12       1.1
SEQ ID NO:348    20F2        1.9
SEQ ID NO:349    2.10E+12    0.7
SEQ ID NO:350    23H11       2.2
SEQ ID NO:351    24C1        0.9
SEQ ID NO:352    24C6        1.3
SEQ ID NO:353    2.40E+08    0.9
SEQ ID NO:354    2_8C3       1.5
SEQ ID NO:355    2H3         0.9
SEQ ID NO:356    30G8        1.6
SEQ ID NO:357    3B_10C4     1.6
SEQ ID NO:358    3B_10G7     1
SEQ ID NO:359    3B_12B1     1.2
SEQ ID NO:360    3B_12D10    0.9
SEQ ID NO:361    3B_2E5      1.3
SEQ ID NO:362    3C_10H3     1.1
SEQ ID NO:363    3C_12H10    1.2
SEQ ID NO:364    3C_9H8      1
SEQ ID NO:365    4A_1B11     1.6
SEQ ID NO:366    4A_1C2      1.2
SEQ ID NO:367    4B_13E1     2
SEQ ID NO:368    4B_13G10    7.6
SEQ ID NO:369    4B_16E1     1
SEQ ID NO:370    4B_17A1     1.1
SEQ ID NO:371    4B_18F11    1.7
SEQ ID NO:372    4B_19C8     1.2
SEQ ID NO:373    4B_1G4      1
SEQ ID NO:374    4B_21C6     0.8
SEQ ID NO:375    4B_2H7      6.2
SEQ ID NO:376    4B_2H8      1.2
SEQ ID NO:377    4B_6D8      1.5
SEQ ID NO:378    4B_7E8      1.2
SEQ ID NO:379    4C_8C9      0.6
SEQ ID NO:380    4H1         1.4
SEQ ID NO:381    6_14D10     1.5
SEQ ID NO:382    6_15G7      1.3
SEQ ID NO:383    6_16A5      1.1
SEQ ID NO:384    6_16F5      1
SEQ ID NO:385    6_17C5      1.3
SEQ ID NO:386    6_18C7      1.2
SEQ ID NO:387    6_18D7      1.2
SEQ ID NO:388    6_19A10     1.9
SEQ ID NO:389    6_19B6      0.7
SEQ ID NO:390    6_19C3      1 A
SEQ ID NO:391    6_19C8      2
SEQ ID NO:392    6_20A7      1
SEQ ID NO:393    6_20A9      1.3
SEQ ID NO:394    6_20H5      0.8
SEQ ID NO:395    6_21F4      0.7
SEQ ID NO:396    6_22C9      3.2
SEQ ID NO:397    6_22D9      1.3
SEQ ID NO:398    6_22H9      1.1
SEQ ID NO:399    6_23H3      1.1
SEQ ID NO:400    6_23H7      1.2
SEQ ID NO:401    6_2H1       3.9
SEQ ID NO:402    6_3D6       1
SEQ ID NO:403    6_3G3       1
SEQ ID NO:404    6_3H2       1
SEQ ID NO:405    6_4A10      1.1
SEQ ID NO:406    6_4B1       1
SEQ ID NO:407    6_5D11      1
SEQ ID NO:408    6_5F11      1.9
SEQ ID NO:409    6_5G9       1.4
SEQ ID NO:410    6_6D5       1
SEQ ID NO:411    6_7D1       0.5
SEQ ID NO:412    6_8H3       1
SEQ ID NO:413    6_9G11      1.3
SEQ ID NO:414    6F1         1.8
SEQ ID NO:415    7_1C4       1.1
SEQ ID NO:416    7_2A10      0.8
SEQ ID NO:417    7_2A11      1.1
SEQ ID NO:418    7_2D7       1.1
SEQ ID NO:419    7_5C7       1
SEQ ID NO:420    7_9C9       1
SEQ ID NO:421    9_13F10     0.7
SEQ ID NO:422    9_13F1      1.1
SEQ ID NO:423    9_15D5      1.2
SEQ ID NO:424    9_15D8      1.1
SEQ ID NO:425    9_15H3      1.9
SEQ ID NO:426    9_18H2      1.1
SEQ ID NO:427    9_20F12     1
SEQ ID NO:428    9_21C8      1.2
SEQ ID NO:429    9_22B1      1.4
SEQ ID NO:430    9_23A10     1
SEQ ID NO:431    9_24F6      0.9
SEQ ID NO:432    9_4H10      1.5
SEQ ID NO:433    9_4H8       0.6
SEQ ID NO:434    9_8H1       1.7
SEQ ID NO:435    9_9H7       0.7
SEQ ID NO:436    9C6         2.5
SEQ ID NO:437    9H11        2.3
SEQ ID NO:438    0_4B10      0.68
SEQ ID NO:439    0_5B11      0.54
SEQ ID NO:440    0_5B3       0.39
SEQ ID NO:441    0_5B4       0.6
SEQ ID NO:442    0_5B8       0.27
SEQ ID NO:443    0_5C4       0.67
SEQ ID NO:444    0_5D11      0.67
SEQ ID NO:445    0_5D3       0.5
SEQ ID NO:446    0_5D7       1.1
SEQ ID NO:447    0_6B4       0.8
SEQ ID NO:448    0_6D10      0.1
SEQ ID NO:449    0_6D11      0.44
SEQ ID NO:450    0_6F2       0.34
SEQ ID NO:451    0_6H9       0.47
SEQ ID NO:452    10_4C10     0.1
SEQ ID NO:453    10_4D5      0.1
SEQ ID NO:454    10_4F2      0.2
SEQ ID NO:455    10_4F9      0.1
SEQ ID NO:456    10_4G5      0.58
SEQ ID NO:457    10_4H4      0.1
SEQ ID NO:458    11_3A11     0.1
SEQ ID NO:459    11_3B1      0.63
SEQ ID NO:460    11_3B5      0.26
SEQ ID NO:461    11_3C12     0.1
SEQ ID NO:462    11_3C3      0.22
SEQ ID NO:463    11_3C6      0.21
SEQ ID NO:464    11_3D6      0.1
SEQ ID NO:465    1_1G12      0.1
SEQ ID NO:466    1_1H1       1.8
SEQ ID NO:467    1_1H2       0.44
SEQ ID NO:468    1_1H5       1.5
SEQ ID NO:469    1_2A12      1.3
SEQ ID NO:470    1_2B6       0.58
SEQ ID NO:471    1_2C4       0.8
SEQ ID NO:472    1_2D2       1.2
SEQ ID NO:473    1_2D4       1.2
SEQ ID NO:474    1_2F8       1.9
SEQ ID NO:475    1_2H8       0.48
SEQ ID NO:476    1_3A2       0.8
SEQ ID NO:477    1_3D6       3.5
SEQ ID NO:478    1_3F3       1.5
SEQ ID NO:479    1_3H2       0.7
SEQ ID NO:480    1_4C5       0.93
SEQ ID NO:481    1_4D6       1.4
SEQ ID NO:482    1_4H1       1.2
SEQ ID NO:483    1_5H5       0.51
SEQ ID NO:484    1_6F12      14.7
SEQ ID NO:485    1_6H6       1.05
SEQ ID NO:486    3_11A10     0.17
SEQ ID NO:487    3_14F6      0.25
SEQ ID NO:488    3_15B2      0.1
SEQ ID NO:489    3_6A10      0.66
SEQ ID NO:490    3_6B1       0.43
SEQ ID NO:491    3_7F9       0.29
SEQ ID NO:492    3_8G11      0.1
SEQ ID NO:493    4_1B10      0.1
SEQ ID NO:494    5_2B3       0.1
SEQ ID NO:495    5_2D9       0.1
SEQ ID NO:496    5_2F10      0.1
SEQ ID NO:497    6_1A11      0.1
SEQ ID NO:498    6_1D5       0.1
SEQ ID NO:499    6_1F11      0.1
SEQ ID NO:500    6_1F1       0.1
SEQ ID NO:501    6_1H10      0.1
SEQ ID NO:502    6_1H4       0.1
SEQ ID NO:503    8_1F8       0.1
SEQ ID NO:504    8_1G2       0.1
SEQ ID NO:505    8_1G3       0.1
SEQ ID NO:506    8_1H7       0.1
SEQ ID NO:507    8_1H9       0.1
SEQ ID NO:508    GAT1_21F12  4.6
SEQ ID NO:509    GAT1_24G3   3.8
SEQ ID NO:510    GAT1_29G1   4
SEQ ID NO:511    GAT1_32G1   3.3
SEQ ID NO:512    GAT2_15G8   2.8
SEQ ID NO:513    GAT2_19H8   2.8
SEQ ID NO:514    GAT2_21F1   3

Table 5. GAT polypeptide kcat/KM values 

SEQ ID NO.       Clone ID    kcat/KM (mM⁻¹ min⁻¹)
SEQ ID NO:263    13_10F6     37.4
SEQ ID NO:264    13_12G6     43.4
SEQ ID NO:265    14_2A5      175.5
SEQ ID NO:266    14_2C1      43
SEQ ID NO:267    14_2F11     80 fi
SEQ ID NO:268    CHIMERA     119.6
SEQ ID NO:269    10_12D7     43
SEQ ID NO:270    10_15F4     37.6
SEQ ID NO:271    10_17D1     80.1
SEQ ID NO:272    10_17F6     34 P
SEQ ID NO:273    10_18G9     20
SEQ ID NO:274    10_1H3      40.1
SEQ ID NO:275    10_20D10    S3 9
SEQ ID NO:276    10_23F2     HP K 1 It-.xJ
SEQ ID NO:277    10_2B8
SEQ ID NO:278    10_2C7      Qfi 4
SEQ ID NO:279    10_3G5      43.7
SEQ ID NO:280    10_4H7      RR Q
SEQ ID NO:281    10_6D11     52
SEQ ID NO:282    10_8C6      31
SEQ ID NO:283    11C3        0.9
SEQ ID NO:284    11G3        8.9
SEQ ID NO:285    11H3        0.9
SEQ ID NO:286    12_1F9      PR R
SEQ ID NO:287    12_2G9      101
SEQ ID NO:288    12_3F1      4Q
SEQ ID NO:289    12_5C10     SQ 7
SEQ ID NO:290    12_6A10     49.7
SEQ ID NO:291    12_6D1      40.8
SEQ ID NO:292    12_6F9      46.9
SEQ ID NO:293    12_6H6      56.5
SEQ ID NO:294    12_7D6      38.5
SEQ ID NO:295    12_7G11     117.2
SEQ ID NO:296    12F5        1.7
SEQ ID NO:297    12G7        0 fi
SEQ ID NO:298    12H6        10.4
SEQ ID NO:299    13_12G12    SP 4
SEQ ID NO:300    13_6D10     456.1
SEQ ID NO:301    13_7A7      234
SEQ ID NO:302    13_7B12     40.5
SEQ ID NO:303    13_7C1      32.1
SEQ ID NO:304    13_8G6      55.2
SEQ ID NO:305    13_9F6      45.3
SEQ ID NO:306    14_10C9     141.1
SEQ ID NO:307    14_10H3     175.3
SEQ ID NO:308    14_10H9     115.6
SEQ ID NO:309    14_11C2     108.7
SEQ ID NO:310    14_12D8     62.1
SEQ ID NO:311    14_12H6     101.3
SEQ ID NO:312    14_2B6      54.3
SEQ ID NO:313    14_2G11     49.6
SEQ ID NO:314    14_3B2      80.9
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SEQ ID NO:315 


14 4H8 


99.4 


SEQ ID NO:316 


14 6A8 


56 


SEQ ID NO:317 


14_6B10 


96.2 


SEQ ID NO:318 


14_6D4 


256 i 


SEQIDNO:319 


14_7A11 


53.3 


SEQ ID NO:320 


14 7A1 


97.4 


SEQ ID NO:321 


14 7A9 


76.9 


SEQ ID NO:322 


14 7G1 


207.1 


SEQ ID NO:323 


14_7H9 


49.5 


SEQ ID NO:324 


14_8F7 


50.3 


SEQ ID NO:325 


15 10C2 


87.3 


SEQ ID NO:326 


15 10D6 


67.1 


SEQ ID NO:327 


15_11F9 


76.4 


SEQ ID NO:328 


15_11H3 


61.9 


SEQ ID NO:329 


15 12A8 


48.2 


SEQ ID NO:330 


15 12D6 


200.8 


SEQ ID NO:331 


15 12D8 


45.9 ! 


SEQ ID NO:332 


15 12D9 


42.6 


SEQ ID NO:333 


15 3F10 


54.1 


SEQ ID NO:334 


15 3G11 


59.6 


SEQ ID NO:335 


15 4F11 


89.2 


SEQ ID NO:336 


15 4H3 


93.3 


SEQ ID NO:337 


15 6D3 


61 .3 


SEQ ID NO:338 


15 6G11 


41 


SEQ ID NO:339 


15 9F6 


54.2 


SEQ ID NO:340 


15F5 


0.2 


SEQ ID NO:341 


16A1 


3.6 


SEQ ID NO:342 


16H3 


1.2 


SEQ ID NO:343 


17C12 


2.3 


SEQ ID NO:344 


18D6 


8 


SEQ ID NO:345 


19C6 


2 


SEQ ID NO:346 


19D5 


1.3 


SEQ ID NO:347 


20A12 


2.5 


SEQ ID NO:348 


20F2 


2


SEQ ID NO:349 


21E11


1.5 


SEQ ID NO:350 


23H11 


3.2 


SEQ ID NO:351 


24C1 


1.8 


SEQ ID NO:352 


24C6 


2.1 


SEQ ID NO:353 


24E7


9.8 


SEQ ID NO:354 


2 8C3 


16.6 


SEQ ID NO:355 


2H3 


17.7 


SEQ ID NO:356 


30G8 


6.4 


SEQ ID NO:357 


3B 10C4 


15.5 


SEQ ID NO:358 


3B 10G7 


19.6 


SEQ ID NO:359 


3B 12B1 


19 


SEQ ID NO:360 


3B 12D10 


6 


SEQ ID NO:361 


3B 2E5 


12.6 


SEQ ID NO:362 


3C 10H3 


30.8 


SEQ ID NO:363


unlU 


7 R 


SEQ ID NO:364 


3C 9H8 


11.7 


SEQ ID NO:365 


4A 1B11 


15 


SEQ ID NO:366 


4A 1C2 


17 


SEQ ID NO:367 


4B 13E1 


18.6 


SEQ ID NO:368 


4B 13G10 


4.6 


SEQ ID NO:369 


4B 16E1 


17 


SEQ ID NO:370 


4B 17A1 


17.4 


SEQ ID NO:371 


4B 18F11 


8.6 


SEQ ID NO:372 


4B 19C8 


13.2 
SEQ ID NO:373


4B_1G4


3.7 


SEQ ID NO:374 


4B_21C6 


14.8 


SEQ ID NO:375 


4B_2H7 


4.4 


SEQ ID NO:376 


4B_2H8 


31.9


SEQ ID NO:377 


4B_6D8 


15.2 


SEQ ID NO:378 


4B_7E8 


17.1 


SEQ ID NO:379 


4C_8C9 


15.1 


SEQ ID NO:380 


4H1 


0.9 


SEQ ID NO:381 


6_14D10


28.2 


SEQ ID NO:382 


6 15G7 


37.3 


SEQ ID NO:383 


6 16A5 


39.8 


SEQ ID NO:384 


6_16F5 


35.2 


SEQ ID NO:385 


6_17C5 


27.1 


SEQ ID NO:386 


6_18C7


26.8 


SEQ ID NO:387 


6_18D7 


35.8 


SEQ ID NO:388 


6_19A10 


45.7 


SEQ ID NO:389 


6_19B6 


34.2 


SEQ ID NO:390 


6_19C3 


16.5 


SEQ ID NO:391 


6_19C8 


37.4 


SEQ ID NO:392 


6_20A7 


40.4 


SEQ ID NO:393 


6_20A9 


34.7 


SEQ ID NO:394 


6_20H5 


24.3 


SEQ ID NO:395 


6_21F4 


34.7 


SEQ ID NO:396 


6_22C9 


14.8 


SEQ ID NO:397 


6_22D9


33.8 


SEQ ID NO:398 


6_22H9 


15.9 


SEQ ID NO:399 


6_23H3 


39.9 


SEQ ID NO:400 


6_23H7 


38.5 


SEQ ID NO:401 


6_2H1 


29.5 


SEQ ID NO:402 


6 3D6 


41.7


SEQ ID NO:403 


6_3G3 


51.9


SEQ ID NO:404 


6 3H2 


57.2 


SEQ ID NO:405 


6_4A10 


50 


SEQ ID NO:406 


6_4B1


27 


SEQ ID NO:407 


6_5D11


15.2 


SEQ ID NO:408 


6_5F11


21.1 


SEQ ID NO:409 


6_5G9 


25.6 


SEQ ID NO:410 


6_6D5 


55.3 


SEQ ID NO:411


6_7D1 


39.5 


SEQ ID NO:412


6_8H3 


44.7 


SEQ ID NO:413 


6_9G11


60.3 


SEQ ID NO:414


6F1 


5.6 


SEQ ID NO:415 


7_1C4 


15.9 


SEQ ID NO:416 


7_2A10 


18.2 


SEQ ID NO:417 


7 2A11 


42.6 


SEQ ID NO:418 


7 2D7 


49.9 


SEQ ID NO:419 


7 5C7 


44.7 


SEQ ID NO:420


7_9C9 


65 


SEQ ID NO:421 


9 13F10 


4y.o 


SEQ ID NO:422 


9_13F1 


28.7 


SEQ ID NO:423 


9 15D5 


23 


SEQ ID NO:424


9 15D8 


97.6 


SEQ ID NO:425 


9 15H3 


36.2 


SEQ ID NO:426 


9 18H2 


22.7 


SEQ ID NO:427 


9_20F12 


37.8 


SEQ ID NO:428 


9 21 C8 


23.8 


SEQ ID NO:429


9 22B1 


35.8 


SEQ ID NO:430


9 23A10 


21 
SEQ ID NO:431


9_24F6 


58.3 


SEQ ID NO:432 


9_4H10 


67.5 


SEQ ID NO:433 


9_4H8 


78.5 


SEQ ID NO:434 


9_8H1 


44 


SEQ ID NO:435 


9_9H7 


40 


SEQ ID NO:436 


9C6 


5.1 


SEQ ID NO:437 


9H11 


1.7 


SEQ ID NO:438 


0_4B10 


279 


SEQ ID NO:439 


0_5B11


406 


SEQ ID NO:440 


0_5B3 


367 


SEQ ID NO:441 


0 5B4 


301 


SEQ ID NO:442 


0 5B8 


522 


SEQ ID NO:443 


0_5C4 


306 


SEQ ID NO:444 


0_5D11 


334 


SEQ ID NO:445 


0 5D3 


660 


SEQ ID NO:446 


0 5D7 


222 


SEQ ID NO:447 


0 6B4 


315 


SEQ ID NO:448 


0_6D10 


1177 


SEQ ID NO:449 


0 6D11 


481 


SEQ ID NO:450 


0 6F2 


516 


SEQ ID NO:451 


0 6H9 


486 


SEQ ID NO:452 


10 4C10 


695.98 


SEQ ID NO:453 


10 4D5 


827.16 


SEQ ID NO:454 


10 4F2 


1155.19 


SEQ ID NO:455 


10_4F9 


553.93 


SEQ ID NO:456 


10 4G5 


304.57 


SEQ ID NO:457 


10 4H4 


1183.6 


SEQ ID NO:458 


11_3A11 


556.62 


SEQ ID NO:459 


11 3B1 


349.17 


SEQ ID NO:460 


11 3B5 


748.49 


SEQ ID NO:461 


11 3C12 


490.67 


SEQ ID NO:462 


11 3C3 


972.81 


SEQ ID NO:463 


11 3C6 


878.27 


SEQ ID NO:464 


11 3D6 


553.01 


SEQ ID NO:465 


1 1G12 


584.79 


SEQ ID NO:466 


1 1H1 


162 


SEQ ID NO:467 


1 1H2 


366 


SEQ ID NO:468


1 1H5 


63 


SEQ ID NO:469 


1 2A12 


176 


SEQ ID NO:470 


1 2B6 


239


SEQ ID NO:471 


1 2C4 


242 


SEQ ID NO:472 


1 2D2 


104 


SEQ ID NO:473 


1 2D4 


152 


SEQ ID NO:474 


1 2F8 


85 


SEQ ID NO:475 


1 2H8 


294 


SEQ ID NO:476 


1 3A2 


227 


SEQ ID NO:477 


1 3D6 


64 


SEQ ID NO:478 


1 3F3 


112 


SEQ ID NO:479 


1 3H2 


I oo 


SEQ ID NO:480 


1 4C5 


273 


SEQ ID NO:481 


1 4D6 


98 


SEQ ID NO:482 


1 4H1 


196 


SEQ ID NO:483 


1 5H5 


419 


SEQ ID NO:484 


1 6F12 


14 


SEQ ID NO:485 


1 6H6 


259 


SEQ ID NO:486 


3 11A10 


796.55 


SEQ ID NO:487 


3 14F6 


753.73 


SEQ ID NO:488 


3 15B2 


1041.32 
SEQ ID NO:489


3_6A10


191.64 


SEQ ID NO:490 


3_6B1 


611.81 


SEQ ID NO:491 


3_7F9 


667.4 


SEQ ID NO:492 


3_8G11 


991.44


SEQ ID NO:493 


4_1B10


770.91 


SEQ ID NO:494 


5 2B3 


567.5 


SEQ ID NO:495 


5 2D9 


754.36 


SEQ ID NO:496 


5_2F10 


547.22 


SEQ ID NO:497 


6_1A11 


455.41 


SEQ ID NO:498 


6 1D5 


429.16 


SEQ ID NO:499


6_1F11


1057.6 


SEQ ID NO:500 


6_1F1 


698.15 


SEQ ID NO:501 


6 1H10 


170.11 


SEQ ID NO:502 


6_1H4 


859.12 


SEQ ID NO:503 


8 1F8 


828.78 


SEQ ID NO:504 


8 1G2 


674.73 


SEQ ID NO:505 


o Ibo 


1 ftftQ 0*7 

i uoo.y / 


SEQ ID NO:506 


8 1H7 


1012.4 


SEQ ID NO:507 


8_1H9


783.89 


SEQ ID NO:508 


GAT1 21F12 


1.2


SEQ ID NO:509 


GAT1 24G3 


1.3 


SEQ ID NO:510 


GAT1 29G1 


1.5 


SEQ ID NO:511


GAT1 32G1 


1.4 


SEQ ID NO:512 


GAT2 15G8 


1.6 


SEQ ID NO:513


GAT2 19H8 


1.5 


SEQ ID NO:514 


GAT2 21 F1 


1.4 



KM for AcCoA is measured using the mass spectrometry method with
repeated sampling during the reaction. Acetyl-coenzyme A and glyphosate (ammonium
salts) are placed as 50-fold-concentrated stock solutions into a well of a mass spectrometry
sample plate. Reactions are initiated with the addition of enzyme appropriately diluted in
a volatile buffer such as morpholine acetate or ammonium carbonate, pH 6.8 or 7.7. The
sample is repeatedly injected into the instrument and initial rates are calculated from plots
of retention time and peak area. KM is calculated as for glyphosate.
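
The calculation outlined above can be illustrated with a short sketch. It assumes Michaelis-Menten kinetics, assumes that the product peak area is proportional to product concentration, and uses a Hanes-Woolf linearization in place of whatever fitting procedure was actually used. The function names and all numbers are hypothetical and are not data from this disclosure.

```python
from typing import Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_rate(times: Sequence[float], peak_areas: Sequence[float]) -> float:
    """Initial rate taken as the slope of peak area versus sampling time."""
    slope, _ = linear_fit(times, peak_areas)
    return slope


def km_vmax_hanes_woolf(substrate: Sequence[float],
                        rates: Sequence[float]) -> Tuple[float, float]:
    """Hanes-Woolf form: [S]/v = [S]/Vmax + KM/Vmax."""
    s_over_v = [s / v for s, v in zip(substrate, rates)]
    slope, intercept = linear_fit(substrate, s_over_v)
    vmax = 1.0 / slope
    return intercept * vmax, vmax


# Hypothetical, noise-free example: simulate peak areas for several AcCoA
# concentrations, recover the initial rates, then estimate KM and Vmax.
TRUE_KM, TRUE_VMAX = 0.05, 2.0            # arbitrary illustrative values
accoa_mM = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5]
times_min = [0.0, 1.0, 2.0, 3.0, 4.0]

rates = []
for s in accoa_mM:
    v = TRUE_VMAX * s / (TRUE_KM + s)
    areas = [v * t for t in times_min]    # linear early phase
    rates.append(initial_rate(times_min, areas))

km_est, vmax_est = km_vmax_hanes_woolf(accoa_mM, rates)
print(f"KM = {km_est:.4f} mM, Vmax = {vmax_est:.4f} (area units/min)")
```

With real, noisy measurements a nonlinear least-squares fit of the rate equation would usually be preferred over a linearization.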

EXAMPLE 8: SELECTION OF TRANSFORMED E. COLI

An evolved gat gene (a chimera with a native B. licheniformis ribosome 
binding site (AACTGAAGGAGGAATCTC; SEQ ID NO:515) attached directly to the 5' 
end of the GAT coding sequence) was cloned into the expression vector pQE80 (Qiagen) 
between the EcoRI and HindIII sites, resulting in the plasmid pMAXY2190 (Figure 11).
This eliminated the His tag domain from the plasmid and retained the β-lactamase gene
conferring resistance to the antibiotics ampicillin and carbenicillin. pMAXY2190 was 
electroporated (BioRad Gene Pulser) into XL1 Blue (Stratagene) E. coli cells. The cells 
were suspended in SOC rich medium and allowed to recover for one hour. The cells were 
then gently pelleted, washed one time with M9 minimal medium lacking aromatic amino
acids (12.8 g/L Na2HPO4·7H2O, 3.0 g/L KH2PO4, 0.5 g/L NaCl, 1.0 g/L NH4Cl, 0.4%
glucose, 2 mM MgSO4, 0.1 mM CaCl2, 10 mg/L thiamine, 10 mg/L proline, 30 mg/L
carbenicillin), and resuspended in 20 ml of the same M9 medium. After overnight growth 
at 37°C at 250 rpm, equal volumes of cells were plated on either M9 medium or M9 plus 1 
mM glyphosate medium. The pQE80 vector with no gat gene was similarly introduced into E.
coli cells and plated for single colonies for comparison. The results are summarized in 
Table 6 and clearly demonstrate that GAT activity allows selection and growth of 
transformed E. coli cells with less than 1% background. Note that no IPTG induction was 
necessary for sufficient GAT activity to allow growth of transformed cells. 
Transformation was verified by re-isolation of pMAXY2190 from the E. coli cells grown 
in the presence of glyphosate. 
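
The construct design described above, a ribosome binding site joined directly to the 5' end of a coding sequence, can be sketched in a few lines. The RBS string is the one quoted in the text (SEQ ID NO:515). The choice of AGGAGG as the Shine-Dalgarno core motif, the function names, and the placeholder coding sequence are assumptions made only for illustration.

```python
# RBS quoted in the text above (SEQ ID NO:515).
RBS = "AACTGAAGGAGGAATCTC"
# Assumption: AGGAGG is treated here as the Shine-Dalgarno core motif.
SD_MOTIF = "AGGAGG"


def fuse_rbs_to_cds(rbs: str, cds: str) -> str:
    """Attach the RBS directly to the 5' end of a coding sequence."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence is expected to start with ATG")
    return rbs.upper() + cds


def spacer_length(construct: str, motif: str = SD_MOTIF) -> int:
    """Count nucleotides between the end of the motif and the next ATG."""
    motif_start = construct.find(motif)
    if motif_start < 0:
        raise ValueError("motif not found")
    motif_end = motif_start + len(motif)
    atg = construct.find("ATG", motif_end)
    if atg < 0:
        raise ValueError("no ATG found downstream of the motif")
    return atg - motif_end


# Hypothetical placeholder ORF, used only to exercise the functions.
placeholder_cds = "ATGAAAGGCTAA"
construct = fuse_rbs_to_cds(RBS, placeholder_cds)
print(construct)
print("motif position in RBS:", RBS.find(SD_MOTIF))
print("spacer to start codon:", spacer_length(construct))
```

The same check could be run on any gat coding sequence from the sequence listing by passing it in place of the placeholder.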



Table 6. Glyphosate selection of pMAXY2190 in E. coli 



                 Number of colonies
Plasmid          M9 - glyphosate     M9 + 1 mM glyphosate
pMAXY2190        568                 512
pQE80            324                 3
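
The "less than 1% background" figure can be checked directly from the colony counts in Table 6. The sketch below is only an arithmetic illustration; the dictionary layout and the function name are hypothetical.

```python
# Colony counts transcribed from Table 6.
colonies = {
    "pMAXY2190": {"minus_glyphosate": 568, "plus_glyphosate": 512},
    "pQE80": {"minus_glyphosate": 324, "plus_glyphosate": 3},
}


def plating_efficiency(counts: dict) -> float:
    """Fraction of colonies obtained with glyphosate relative to without."""
    return counts["plus_glyphosate"] / counts["minus_glyphosate"]


for plasmid, counts in colonies.items():
    print(f"{plasmid}: {100 * plating_efficiency(counts):.2f}% of colonies "
          "recovered on 1 mM glyphosate")

# Background = survival of cells carrying the empty vector.
background = plating_efficiency(colonies["pQE80"])
```

Here the empty-vector control gives 3/324, which is just under 1%, while the gat plasmid retains roughly nine tenths of its colonies.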



EXAMPLE 9: SELECTION OF TRANSFORMED PLANT CELLS 

Agrobacterium-mediated transformation of plant cells occurs at low 
efficiencies. To allow propagation of transformed cells while inhibiting proliferation of 
non-transformed cells, a selectable marker is needed. Antibiotic markers for kanamycin 
and hygromycin and the herbicide modifying gene bar, which detoxifies the herbicidal 
compound phosphinothricin, are examples of selectable markers used in plants (Methods 
in Molecular Biology, 1995, 49:9-18). Here we demonstrate that GAT activity serves as 
an efficient selectable marker for plant transformation. An evolved gat gene (0_5B8) was 
cloned between a plant promoter (enhanced strawberry vein banded virus) and a 
ubiquinone terminator and introduced into the T-DNA region of the binary vector 
pMAXY3793 suitable for transformation of plant cells via Agrobacterium tumefaciens 
EHA105 as shown in Figure 12. A screenable GUS marker was present in the T-DNA to 
allow confirmation of transformation. Transgenic tobacco shoots were generated using 
glyphosate as the only selecting agent. 

Axillary buds of Nicotiana tabacum L. Xanthi were subcultured on half-strength
MS medium with sucrose (1.5%) and Gelrite (0.3%) under 16-h light (35-42
µEinsteins m⁻² s⁻¹, cool white fluorescent lamps) at 24 °C every 2-3 weeks. Young leaves
were excised from plants after 2-3 weeks of subculture and were cut into 3 x 3 mm segments.
A. tumefaciens EHA105 was inoculated into LB medium and grown overnight to a density
of A600 = 1.0. Cells were pelleted at 4,000 rpm for 5 minutes and resuspended in 3
volumes of liquid co-cultivation medium composed of Murashige and Skoog (MS)
medium (pH 5.2) with 2 mg/L N6-benzyladenine (BA), 1% glucose and 400 µM
acetosyringone. The leaf pieces were then fully submerged in 20 ml of A. tumefaciens in
100 x 25 mm Petri dishes for 30 min, blotted with autoclaved filter paper, then placed on
solid co-cultivation medium (0.3% Gelrite) and incubated as described above. After 3 days
of co-cultivation, 20-30 segments were transferred to basal shoot induction (BSI) medium
composed of MS solid medium (pH 5.7) with 2 mg/L BA, 3% sucrose, 0.3% Gelrite, 0-200 µM
glyphosate, and 400 µg/ml Timentin.

After 3 weeks, shoots were clearly evident on the explants placed on media
with no glyphosate regardless of the presence or absence of the gat gene. T-DNA transfer
from both constructs was confirmed by GUS histochemical staining of leaves from
regenerated shoots. Glyphosate concentrations greater than 20 µM completely inhibited
any shoot formation from the explants lacking a gat gene. Explants infected with A.
tumefaciens with the gat construct regenerated shoots at glyphosate concentrations up to
200 µM (the highest level tested). Transformation was confirmed by GUS histochemical
staining and by PCR fragment amplification of the gat gene using primers annealing to the
promoter and 3' regions. The results are summarized in Table 7.



Table 7. Tobacco shoot regeneration with glyphosate selection. 



Transferred      % Shoot regeneration at glyphosate concentration
genes            0 µM     20 µM    40 µM    80 µM    200 µM
GUS              100      0        0        0        0
gat and GUS      100      60       30       5        3
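
The selection window implied by Table 7 can be read off programmatically. The sketch below transcribes the table values; the variable and function names are hypothetical, and the code is only a convenience for inspecting the data.

```python
# Values transcribed from Table 7 (% shoot regeneration).
concentrations_uM = [0, 20, 40, 80, 200]
regeneration_pct = {
    "GUS": [100, 0, 0, 0, 0],
    "gat and GUS": [100, 60, 30, 5, 3],
}


def highest_permissive(concs, pcts):
    """Highest tested concentration at which any shoots regenerated."""
    return max(c for c, p in zip(concs, pcts) if p > 0)


def selective_concentrations(concs, control, test):
    """Concentrations that block the control but still allow the test construct."""
    return [c for c, ctrl, tst in zip(concs, control, test) if ctrl == 0 and tst > 0]


for genes, pcts in regeneration_pct.items():
    print(genes, "->", highest_permissive(concentrations_uM, pcts), "uM")

window = selective_concentrations(
    concentrations_uM, regeneration_pct["GUS"], regeneration_pct["gat and GUS"]
)
print("selective concentrations (uM):", window)
```

On these numbers every tested glyphosate concentration above zero discriminates between the two constructs, with regeneration frequency falling as the concentration rises.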



EXAMPLE 10: GLYPHOSATE SELECTION OF TRANSFORMED YEAST CELLS



Selection markers for yeast transformation are usually auxotrophic genes 
that allow growth of transformed cells on a medium lacking the specific amino acid or 
nucleotide. Because Saccharomyces cerevisiae is sensitive to glyphosate, GAT can also 
be used as a selectable marker. To demonstrate this, an evolved gat gene (0_6D10) is 
cloned from the T-DNA vector pMAXY3793 (as shown in Example 9) as a PstI-ClaI
fragment containing the entire coding region and ligated into PstI-ClaI digested p424TEF
(Gene, 1995, 156:119-122) as shown in Figure 13. This plasmid contains an E. coli origin
of replication and a gene conferring carbenicillin resistance as well as TRP1, a tryptophan
auxotrophy selectable marker for yeast transformation.

The gat-containing construct is transformed into E. coli XL1 Blue
(Stratagene) and plated on LB carbenicillin (50 µg/ml) agar medium. Plasmid DNA is
prepared and used to transform yeast strain YPH499 (Stratagene) using a transformation
kit (Bio101). Equal amounts of transformed cells are plated on CSM-YNB-glucose
medium (Bio101) lacking all aromatic amino acids (tryptophan, tyrosine, and
phenylalanine) with added glyphosate. For comparison, p424TEF lacking the gat gene is
also introduced into YPH499 and plated as described. The results demonstrate that GAT
activity will function as an efficient selectable marker. The presence of the gat-containing
vector in glyphosate-selected colonies can be confirmed by re-isolation of the plasmid and
restriction digest analysis.

While the foregoing invention has been described in some detail for 
purposes of clarity and understanding, it will be clear to one skilled in the art from a 
reading of this disclosure that various changes in form and detail can be made without 
departing from the true scope of the invention. For example, all the techniques, methods, 
compositions, apparatus and systems described above may be used in various 
combinations. The invention is intended to include all methods and reagents described 
herein, as well as all polynucleotides, polypeptides, cells, organisms, plants, crops, etc.,
that are the products of these novel methods and reagents. 

All publications, patents, patent applications, or other documents cited in 
this application are incorporated by reference in their entirety for all purposes to the same 
extent as if each individual publication, patent, patent application, or other document were 
individually indicated to be incorporated by reference for all purposes. 
SEQ ID NO.


Clone ID 


Sequence 


SEQ ID NO:1


ST401 gat 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAAGGCGAAGAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAATGCCAGGACATCTGTG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:2 


B6gat 


ATGATTGAAGTCAAACCTATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGCGGATATTATCGGGACAGGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGATATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACATAA 


SEQ ID NO:3 


DS3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGAAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGATATCTGTG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGATCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGGCATAA 


SEQ ID NO:4 


NHA-2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGAAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGATATCTGTG 

AGCGGCTACTATGAAAAGCTCGGCCTCAGCGAACAAGG 

CGGGATCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGGCATAA


SEQ ID NO:5 


NH5-2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 
orryrTTr 1 a rpTr drrvma at a tt a pp a nnnr a a df^Tn a tp 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAGGGCGAAGAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCAC 

GGGCAGACCTTTTATGGTGCAATGCCAGGACATCTGTG 
AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 
CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT 
OATfrTAT A AfrA A ATTfrAfTlTA A 


SEQ ID NO:6 


ST401 
GAT 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGAFH
LGGYYRGKLISIASFHKAEHSELEGEEQYQLRGMATLEGY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYEK
LGFSEQGEVYDIPPIGPHILMYKKLT


SEQ ID NO:7 


B6GAT 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH

LGGYYRDRLISIASFHQAEHSELEGQKQYQLRGMATLEGY

REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYKK

LGFSEQGGVYDIPPIGPHILMYKKLT


SEQ ID NO:8 


DS3 GAT 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH

LGGYYRGKLISIASFHNAEHSELEGQKQYQLRGMA

TLEGYREQKAGSTLIRHAEELLRKKGADLLWCNARISVSG

YYEKLGFSEQGGIYDIPPIGPHILMYKKLA


SEQ ID NO:9 


NHA-2 
GAT 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH
LGGYYRGKLISIASFHNAEHSELEGQKQYQLRGMATLEGY
REQKAGSTLIRHAEELLRKKGADLLWCNARISVSGYYEKL
GLSEQGGIYDIPPIGPHILMYKKLA


SEQ ID 
NO: 10 


NH5-2 
GAT 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYET^
LGGYYQGKLISIASFHKAEHSELEGEEQYQLRGMATLEGY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYEK
LGFSEQGEVYDIPPIGPHILMYKKLT


SEQ ID 
NO: 11 


13_10F6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

ACLj 111 LALL 1 CUvj 1 vjvjA 1 A 1 1 AUCOvjVjOUAAUC 1 OA 1 C 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 
CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT
GATGTATAAGAAATTGACGTAA 


SEQ ID 
NO: 12 


13_12G6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAGACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACTGGGCCCCATATTTTG

ATGTATAAGAAATTGACATAA 


SEQ ID
NO: 13 


14_2A5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGAGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGAAGCAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACGTCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO: 14 


14_2C1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

nroTTTrArrTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGACTGGGCCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO: 15 


14_2F11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

ornTTTrArrTTGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAGGG

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA
GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC
GAAGTCTACGACACACCGCCGGCCGGACCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:16 


CHIMERA 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGCGAGCAAAAAGCGGGCAGTACG 
CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID
NO: 17 


10_12D7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGNATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAG 

GCGAAGTCTACGACATACCGCCGACCGGACCCCATATT 

TTGATGTATAAGAAATTGACGTAA 


SEQ ID
NO: 18 


10_15F4 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGT 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO: 19 


10_17D1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID
NO:20 


10_17F6


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG

GCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG
AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG
CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT 
GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:21 


10_18G9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACTGATTTGCTCGGTGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID

NO:22 


10_1H3 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACCYTTTCA CCTnC^TnnATAT^ATCCtC^GC'AAClC^CiCiTC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCGAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT

GATGTATAAGAAATTGACATAA 


SEQ ID
NO:23 


10_20D10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGAT

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:24 


10_23F2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCTTCCTTTCATCAAGCCGAACACCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA


SEQ ID
NO:25 


10_2B8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID
NO:26 


10_2C7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTCGAAGGGTACCGTGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACACACCGCCGGTCGGACCTCATATT 

TTGATGTATAAGAAATTGACGTAA 


SEQ ID
NO:27 


10_3G5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGACCGGACCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:28 


10_4H7 


ATGATTGAAGTCAAACCGATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:29


10_6D11


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

AC-OL* 1 1 LAL-U 1 tuu 1 LjCjA 1 A 1 1 ALLuuuuL/AAUL 1 Uu J. 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:30 


10_8C6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

LiLvJ 111 CALL 1 LLL 1 OVJ A 1A1 J. ALLUUUULAAUL 1 JL \s 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:31 


11C3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ALCj 111 CALL 1 LCjLt 1 OvjA 1 A 1 1 AL. U AuuUU AAuv^ i u/i i ^ 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACATAA 


SEQ ID
NO:32 


11G3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCGTGTATGTATGAAACCGATTTGCTCGGGGGC 

ALu 111 LALL 1 LVjrLjUjrvJri 1 A 1 JL /\V^^/\O^^V^/Vr\vJv-, x vxrv l 

CAGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGGCATAA 


SEQ ID

NO:33 


11H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC
TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT
(Trr.TTTT A CCTCCrCVmCi AT A TT A CCOCiClCiC A A fiOTO ATP

AGCATCGCCTCCTTTCATCAAGCCGAACACCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCAACTGGGCCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID
NO:34 


12_1F9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 
a pnTTTr a ncTcnnTrin at att ArrrnTfinr a AnrTfi A TP 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCCACGACATACCGCCGACCGGACCCCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQ ID
NO:35 


12_2G9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 
a c nTTiv a ccTcaacac^ a t a tt a ccaGGGC aagctggt 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID
NO:36 


12_3F1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTTGGGGGC 

a rnTTTP a rcTcacYTCiCi at att accggggcaagctgatc 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGAAGTAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:37 


12_5C10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTATCGGGGCAAGCTGATC 

AGCATCGCnTCCllTCATCAAGCCGAACATCCAGAGCTT
GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACGCACCGCCGACCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:38 


12_6A10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

A CCVTTTC A COTCaCiCCiCi A T A TT A CCCX~lCiCiC A A OPTfiOT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:39 


12_6D1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 
a mTTTr A CC'YCCiCVTCWi ATA TT A COCiC^CidO A A OCTCr A TT* 
AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT
GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 
CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 
GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 
GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 
AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 
CGGGGTCTACGACATACCGCCTGTCGGACCTCATATTTT
GATGTATAAGAAATTGACGTAA 


SEQID 
NO:40 


12_6F9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTAAGTATGAAACCGATTTGCTCGGGGGT 

a ccvrvTc ArrTPrkTrno at att aocggggc aagctgatc 

agcatcgcctcctttcatcaagccgaacatccagagctt 

gaaggccaaaaacagtatcagctgagagggatggcga 

cactcgaaggataccgcgagcaaaaagcgggaagcac 

gctcatccgccatgccgaagagcttcttcggaaaaagg 

gggcagaccttttatggtgcaacgccaggacatctgcg

agcggctactataaaaagctcggcttcagcgaacaggg 

cgaagtctacgacataccgccgaccggaccccatatttt 

gatgtataagaaattgacgtaa 


SEQID 
NO:41 


12_6H6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCACCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAAGG
CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA
GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 
GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT
GATGTATAAGAAATTGACATAA 


SEQID 
NO:42 


12_7D6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACTGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGACCGGACCCCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:43 


12_7G11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:44 


12F5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCAGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAGGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAATGCCAGGACATCTGTGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGATCTACGACATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:45 


12G7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCAGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCAC 

ACTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG
AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG
CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT
GATGTATAAGAAATTGACGTAA 


SEQID 
NO:46 


1_2H6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

GTGTTTC ArCTrGGTGGATATTACCGGGGC AAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:47 


13_12G12 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 
a rnTTTr a rnr onTrso ata tt a ccnaaac a a ac^ra a Tr 1 


AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT 

GATGCATAAGAAATTGACGTAA 


SEQID 
NO:48 


13_6D10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTCGCTCGGAGGC 

ac*cytttc ArrTrnnTnG at att accggggc aagctgatc 

AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGGTACCGTGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:49 


13_7A7 


ATGATCGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGAGT 

GCGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCCTCCTTTCACCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGGGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTA 

CGCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG

GGGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACACACCGCCGGTCGGACCTCATATT

TTGATGTATAAGAAATTGACGTAA


SEQID 
NO:50 


13_7B12


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGAGC

a criTTTC A CCTCClCiTClCi AT ATT ACCGGGGC A AGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGACTGGGCCCCATATTTT

GATGTATAAGAAGTTGACGTAA 


SEQID 
NO:51 


13_7C1 


ATGATTGAAGTCAAACCAATAAATGCGGAAGATACGTA 
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TTGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAACTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGTAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCGA 

GAGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GAAGTCTACGACATACCGCCGACTGGGCCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:52 


13_8G6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTCGCTCGGGGGC 

a mmr Af^PTfrMTPfrfr AT ATT ACCGGGGC AAGCTGAT 

CAGCATCGCnTCCTTTAATCAAGCCGAACATCCAGAGCT 

TGAAGGTCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:53 


13_9F6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATCTGCTTGGGGGC 

a rrTTTr a rrT a rwTTYTfi ATA TT a CCfififrOC A AGCTG AT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTA 

CGCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG

GGGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID
NO:54


14_10C9


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC

TAGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT

rir , riTTT , PAr , r^prw^Trk^ATATTAPr i orrCKjCAAGr(TrGATC 

AGCATCGCTTCCTTTCATCAAGCTGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACGTCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAGTTGACGTAA 


SEQID 
NO:55 


14_10H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

^/"•riTTTT' a rT^rrTincnn at att ACcanciCiC 1 A AGPTGGT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGC

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACACACCGCCGGTCGGACCTCATATT 

TTGATGTATAAGAAGTTGACGTAA 


SEQID 
NO:56 


14_10H9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

rrr.TTTr a rrTrcicvraCi AT ATT ArPGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAATTGACATAA 


SEQID 
NO:57 


14_11C2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGAGC 

a nnTrrn a nr^rraamci AT ATT APCGGGGCAAGCTGGT 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGACCGGACCCCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO:58 


14_12D8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTAAGTATGAAACCGATTTGCTCGGGGGT 










CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCTGGCAGTAC 

GCTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCG 

AGCGGCTACTATAAAAAGCTCGGCTTCAGGGAACAAGG 

CGGGGTCTACGACATACCGCCTGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:59 


14_12H6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACTGGGCCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:60 


14_2B6 


ATGATTGAAGTCAAACCAATAAATGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 
a pptttp a rTTPnnTnn a t a tt a rmnrinp A a pjp^tPy A TP 1 


AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:61 


14_2G11 


ATGATTGAAGTCAAACCAATAAATGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 
ppptttp a ppTPnnTnn a tatta c^c^aaaciC* A A OP^TPrPrTP 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCGA 

GTGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACTGGGCCCCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:62 


14_3B2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCAGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAGGCCGAACATCCAGAGCTT
GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCAC 

GCTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGCCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:63 


14_4H8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGAGC 
a r^nTrrc a (^f^ronnr^nn at att a (^cnnnac a Acicm at 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTCGAAGGGTACCGTGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTGTGGTGCAACGCCAGGACGTCTGC 

GAGCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACACACCGCCGGTCGGACCTCATATT 

TTGATGTATAAGAAATTGACGTAA 


SEQID 
NO:64 


14_6A8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

AC^Qj 111 LALL 1 v^Ljvj 1 vjLjA 1A11 ALLUUUULAAU^ I AU 1 ^ 

AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATGTTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:65 


14_6B10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

AC^Cj 111 LALL 1 1 vjVj 1 uuA 1 A 1 1 /iLUuUuUL,/uvvjv^ x \Jt\ l ^ 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATGCCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAGTTGACGTAA 


SEQID 
NO:66 


14_6D4 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGACCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGAGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG
CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAGGG

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:67 


14_7A11 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 
nccvvvvc a crTcndTnci at a tt a ccaaacic a a ac^TcicvTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT

GAAGGCCTAAAACAGTATCAGCTGAGAGGGATGGCGAC 

ACTCGAAGGGTACCGTGAGCAAAAAGCGGGAAGTACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGACCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:68 


14_7A1 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 
f-* r^r** i ■ i ■ i '/-i a c y c* r vc^nnjmc± ata tt a ppAnnnp a a c\c^ r YCxrwc^ 


AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCTAAAACAGTATCAGCTGAGAGGGATGGCGAC 

ACTCGAAGGGTACCGTGAGCAAAAAGCGGGAAGTACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGACCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:69 


14_7A9 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

a rTVTTTC A CfTCndT'Cin A T ATT A CCCiCiCiCiC A A OTTfJGTC 

AGCATCGCCTCCTTTCATCAAGCCAAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGGTACCGTGAGCAAAAAGCGGGTAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID
NO:70


14_7G1


ATGATTGAAGTCAAACCAATAAACGCAGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGTTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGTGAGCAAAAAGCGGGAAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA
GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC
GAAGTCTACGACACACCGCCGGTCGGACCTCATATTTTG
ATGTATAAGAAATTGACGTAA 


SEQID 
NO:71 


14_7H9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACCjl 1 ICACCICGGLAjGAI Al IAt^CGGGUCAAUC.lGlJl 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO:72 


14_8F7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGGT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCTGAAGCGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCA 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGACTGGGCCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:73 


15_10C2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGC 1 Uu 1 C 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGAAGTACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGGCAGGACAACTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGT 

GAAGTCTTCGACATACCGCCGACCGGACCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:74 


15_10D6 


atgattgaagtcaaaccaataaacgcggaagatacgta 
tgagatcaggcaccgcattctccggccgaatcagccgc 

acgtttcacctaggtggatattaccggggcaagctggt 

cagcatcgcctcctttcatcaagccgaacatccagagct 

tgaaggccaaaaacagtatcagctgagagggatggcg 

acacttgaagagtaccgcgagcaaaaagcgggaagca 

cgctcatccgccatck:cgaagagcttcttcggaaaaag 

ggggcagacctcttatggtgcaacgccaggacatctgc 

gagcgggtactataaaaagctcggcttcagcgaacagg 

gcgaagtctacgacataccgccggtcggacctcatattt
TGATGTATAAGAAATTGACGTAA


SEQID 
NO:75 


15_11F9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT

nrciTTTC* a rTTTrtOTOfr a t att a ccggggp a agctggtc 

AGCATCGCCTCCTTTAATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAGAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:76 


15_11H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

nrmrrr a rr^mncvTCin at att a PCGGGGP A AGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACACCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCAACTGGGCCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:77 


15_12A8 


ATGATTGAAGTCAAACGAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

a nnrrrm a rv^rviirJTf^ATATTAPCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:78 


15_12D6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAGTTGACGTAA 


SEQID
NO:79


15_12D8


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACG1 1 1 CACCTCGGCGGA 1 A rTACCGGGGCAAGC I GOT. 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT

TGAAGGCCAAAAACAGTATCAACTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACGTCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CAAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:80 


15_12D9 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGGT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTCGAAGAGTACCGCGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACATAA 


SEQID 
NO:81 


15_3F10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTC ACCTTGGTGGATATTACCGGGGC AAGG 1 GA 1 G 

AGCATCGTTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGCACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACACACCGCCGGCCGGACCTCATATTTT

GATGTATACGAAATTGACGTAA 


SEQID 
NO:82 


15_3G11 


ATGATTGAAGTTAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TTGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTC ACCTCGGCGGATATTACCGGGGCAAGC 1 uu 1 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTGTGGTGCAACGCCAGGACGTCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:83 


15_4F11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TAAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC
ACGl 1 ICACCICGGIGGAI Al IACAAjGGGLAAGCIGGIC

AGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAGAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:84 


15_4H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGGT

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTA 

CGCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGC

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGACTGGGCCCCATATT 

TTGATGTATAAGAAATTGACGTAA 


SEQID 
NO:85 


15_6D3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCCTCCTTTCATCAAGCCGAACACCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGACCGGACCCCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:86


15_6G11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

OOOL,/\vj/\L^ 111 vjjkj l LjL,AAL.LjL^A^LtAL^A 1^1 vjL.Lt 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 
CAAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT 
GATGTATAAGAAGTTGACGTAA 


SEQID 
NO:87


15_9F6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT
TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG

ACACTCGAAGAGTACCGCGAGCAAAAAGCGGGCAGTA 

CGCTTATCCGCCATGCCGAAGAGCTTCTTCGGAGAAAA 

GGCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCTGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO:88 


15F5 


ATGATCGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ALU 111 LALL 1 COG 1 GOG 1 AC 1 ACCGGGGC AAOC 1 OA 1 

cagcatcgcttcctttcataaagccgaacattcagagct

tgagggcgaagaacagtatcagctgagagggatggcg 

acgcttgaaggataccgtgagcaaaaagcgggcagtac 

gcttatccgctatgccgaagagcttcttcgaaaaaaag 

gcgcggaccttttatggtgcaacgccaggacatctgtg 

agcgggtactataaaaagctcggcttcagcgaacaggg 

cgaagtctacgacataccgccgatcggacctcatatttt

gatgtataagaaattgacgtaa 


SEQID 
NO:89 


16A1 


ATGATTGAAGTCAAACCTATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGC ITCACCTCGG 1 OGATA 1 1 ACC AGGGC AACjC 1 G A 1 

CAGCATCGCTTCCTTTCATAAAGCCGAACATTCAGGGCT 

TGAGGGCGAAGAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTCGAAGGGTACCGCGAGCAAAAAGCGGGCAGTA 

CGCTTATCCGCCATGCCGAAGAGCTTCTTCGAAAAAAA 

GGCGCGGACCTTTTATGGTGCAATGCCAGGACATCTGT 

GAGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGATCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO:90 


16H3 


ATGATTGACGTCAAACCTATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGCGGATATTACCAGGGCAAGCTGAT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGAGCTTTTATGGTGCAATGCCAGGACATCTGT 

GAGCGGGTACTATGAAAAGCTCGGCTTCAGCGAACAGG 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO:91


17C12 


ATGATTGAAGTCAAACCAATAAGCGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

GCGTTTCACCTCGGTGGATATTACCAGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC
GCTTATCCGCCATGCCGAAGAGCTTCTTCGAAAAAAAG

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGTG 

AGCGGGTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:92 


18D6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

A C s C¥V r T t TC* A C y C* r TC K C\C¥TC\C\ ATA TT A PrHfinnr A A C\C ir YC^ A TP 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCAA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGGCATAA 


SEQID 
NO:93 


19C6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

A PriTTTP A C*C*TC*CiC¥mCl ATA TT A C^C^CICXCXCIC* A A PPTPi A TY"" 1 

TGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAATGCCAGGACATCTGTG 

AGAGGCTACTATGAAAAGCTCGGCTTCAGCGAACAAGG

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGGCGTAA 


SEQID 
NO:94 


19D5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACTGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

A PPTTTP A PPTPOriTHH ATA TT A fY* 1 A nnCir** A A PPTP A TP 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAATGCCAGGACATCTGTGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID
NO:95




A TO ATTfr A A fiTC AAA CC A AT A A A COPOO A A GAT A CftTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCAGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGTGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG

GGGTAGACCTTTTATGGTGCAACGCCAGGACATCTGTG
AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG
CGGGATCTACGACATACCGCCGATCGGACCTCATATTTT
GATGTATAAGAAATTGGCATAA 


SEQID 
NO:96 


20F2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG

GCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:97 


21E11


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

uUjl 1 ICACCICGGIGGAI Al 1 ACC AGGGCA AGUTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:98 


23H11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAGGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ALul 1 IL-AL-CICUGIGGAIAI IACCAGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTCCGAAAAAAAGG 

CGCGGACCTTTTATGGTGCAATGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCACCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGGCATAA 


SEQID 
NO:99 


24C1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
r vc\r\ a a or 1 a thp a a nT a th a a a r^r^r^ a t T r T r rru~vrc y ciCicicicic* 

1 UU/y\UL./\ 1 UL/\/\u 1/\1 0/V/\/\v^v^O/\ 111 VjrV^ 1 V^OOvjOLjI^ 

ACGTTTCACCTCGGCGGATATTATCGGGACAGGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT
GATGTATAAGAAACTGACGTAA


SEQID 
NO: 100 


24C6 


ATGATTGAAGTCAAACCTATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

Av-.Lt 111 LALL 1 CvjLt 1 LjLjA 1 A 1 1 ACCLtvjLtOU AAvjL-. 1 vjrA 1 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGAAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGATATCTGTG 

AGCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGGCATAA 


SEQID 
NO: 101 


2.40E+08 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAGGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

AUG ill LA 1 C 1 CGG I GGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAATGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGGCATAA 


SEQID 
NO: 102 


2_8C3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGTATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

AGG1 1 1GAGCTCGGCGGATATTATCGGGACAGGCTGATC 

AGX3ATCGKZCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 103 


2H3 


ATGATTGAAGTCAAACCGATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCAGGGCAAGCTGATC
Ann a rrnrTTrrTTTP A tp a a or^pnn apa ttp aha nr^nnr 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCGAAAAGCGGGAAGTAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGATATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID
NO: 104


30G8


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTTTGAAACCGATTTGCTCGGGGGTG 

Lu 111 LALL 1 CLrvj 1 (jLiA 1 A 1 1 ALL ALivjvjtL MuL 1 (jrA 1 LA 

GCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTTG 

AAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGAC 

GCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACGC 

TTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGGC 

GCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGAG 

CGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGCG 

AAGTCTACGACATACCGCCGATCGGACCTCATATTTTGA

TGTATAAGAAATTGACGTAA 


SEQID 
NO: 105 


3B_10C4 


ATGATTGAAGTCAGACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGTATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGCCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 106 


3B_10G7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ALu 111 LACL 1 Luu 1 GGA 1 ATI ACCGGGGC AAGCTGA 1 C 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGATCGGACCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 107 


3B_12B1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 
ACGTTTCACCTCGGTGGATATTAC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 
GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 108 


3B_12D10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGTATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTACGAAACCGATTTGCTCGGGGGT 
uLu 111 CAL.C 1 CLiIj 1 GGA 1 A 1 1 A<^1_-vjLj<j<jLAAIjU 1 GA 1 (_

AGCATCGCCTCCTTTCATCCAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG

CGCGGACCTTTTATGGTGCAACGCCAGGATATCTGCGA

GCGGGTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC

GAAGTCTACGACATACCGCCGATCGGACCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 109 


3B_2E5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACG1 1 ICACCIMjGIGGAIAI rACCGGGGLAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCAAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 110 


3C_10H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGTATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGAAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGATATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:111


3C_12H10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGGGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 
GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 
ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 112 


3C_9H8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGTATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGCGGATATTATCAGGACAGGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT
GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCTATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGATATCTGCG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 113 


4A_1B11 


ATGATTGAAGTCAAACCTATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

AL*vj 111 LALL 1 L>ljrO 1 A 1 A 1 1 ACCLtLjLjvjU AAvjC 1 CjA 1 C 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:114


4A_1C2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ALO 111 GAv_C 1 CUGCGGA 1 A 1 1 A 1 CGGGGCA AGC I GATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:115 


4B_13E1 


ATGATTGAAGTCAAACCTATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

Av_,tjr 111 LAtL 1 COG 1 GGA 1 A 1 1 AGGGGGGCAAGUTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGATATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

G-A AGTCT ACGAC? AT ACCCtCCCt A TCCiCr A CTTC* A T A ' 1 ' r I * 1 'Ct 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 116 


4B_13G10 


TTACGTCAATTTCTTATACATCAAAATATGAGGTCCGAT 

CGGCGGTATGTCGTAGACTTCGCCCTGTTCGCTGAAGCC 

GAGCTTTTTATAGTACCCGCTCGCAGATGTCCTGGCGTT 

GCACCATAAAAGGTCCGCGCCTTTTTTCCGAAGAAGCTC 

TTCGGCATGGCGGATGAGCGTGCTTCCCGCTTTTTGCTC 

GCGGTACCCTTCAAGCGTCGCCATCCCTCTCAGCTGATA 

CTGTTTTTGGCCTTCAAGCTCTGAATGTTCGGCTTGATG
AAAGGAGGCGATGCTGATCAGCTTGCCCCGGTAATATC

CACCGAGGTGAAACGTGCCCCCGAGCAAATCAGTTTCA 

TACTTGCATGCTTCCAGCGGCTGATTCGGCCGGAGAATG 

CGGTGCCTGATCTCATACGTATCTTCCGCGTTTATTGGT 

TTGGCTTCAATCAT 


SEQID 
NO: 117 


4B_16E1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ALAjrl 1 ICACCICuIjCAjvjAI Al 1 ALLuuUuLAAuL 1 CjrAl 

CAGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 118 


4B_17A1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACCjI 1 1CACCICCj(jCCjOA1A1TACCO(j<jGCAAG(J1GAT 

CAGCATCGCTTCCTTTCATCAAGCCGAGCATCCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACATAA 


SEQID 
NO: 119 


4B_18F11 


ATGATTGAAGTCAATCCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

A(_tj 1 CI CACC 1 CCKjCGCjA I A I ± ACCtjCGGCAAGC 1 CiAT 

CAGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCT 

TGATGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG 

AGCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTC 

GATGTATAAGAAATTGACGTAA 


SEQID
NO: 120




A TCI A TTn A APtTPA A APPA ATA A AOr^^OPlA APrATAPPiTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGC
GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAG
GCGGGGTCTACGATATACCGCCGATCGGACCTCATATTT 
TGATGTATAAGAAATTGGCATAA 


SEQID 
NO: 121 


4B_1G4 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

LjK^kj 111 LALU 1 C0l_rL-0O/\ 1A1 1 /\Ov^OrOOOL^/Y/\Lxl^ 1 uA 1 

CAGCATCGCCTCCTTTCATCAATCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGGTACCGCGAGCTAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGATATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 122 


4B_21C6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACIj 111 LALC 1 LAjur 1 IjVjA 1 A 1 1 AUt-LitjrvjttjtL.AAtjC 1 OA 1 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGATATCTGCG 

AGCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 123 


4B_2H7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

At_Lr 111 LALL 1 LUu 1 uuA 1 A 1 1 A\_V_-<JtjljOl_ AAUL 1 \JA 1 l~ 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT

GAAGGCCAAAAACAGTACCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGGCATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACATAA 


SEQID 
NO: 124 


4B_2H8 


ATGATTGAAGCCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TYV1A AOPATHPA A fVT A Tdr A A A r^TT^ A HTT^r^f^Tr^r^rir^f^r^f^ 
1 vjrw/\/\Ov_^/V 1 \JK^r\J\\J 1 f\x 1 Ur/\ 111 Kj*^, 1 L.uUUUuU 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT
GATGTATAAGAAATTGACGTAA


SEQID 
NO: 125 


4B_6D8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACLr 111 LAtL 1 CIjO 1 oUA 1 A 1 1 ACUOoOoUAAOC 1 OA 1 C 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGTAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACATGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 126 


4B_7E8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCATGTATGAAACCGATTTGCTCGGGGGC 

ACO 111 LALL 1 CLrvj 1 CjLr A 1 A 1 1 ACCCjCjCjGC AAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 127 


4C_8C9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGT 

Lrf-Lr ill CALt 1 ULnj I OGA 1 A 1TACCGGGGC A AGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTAACATAA


SEQID 
NO: 128 


4H1 


ATGATTGAGGTGAAACCGATTAACGCAGAGGAGACCTA 

TGAACTAAGGCATAGGATACTCAGACCACACCAGCCGA 

TAGAGGTTTGTATGTATGAAACCGATTTACTTCGTGGTG 

Co ill LAL 1 1 ALxGCCjGC 1 TTTAC aggggc aagctg attt 

CCATAGCnTCATTCCACCAGGCrfTA<*Tr , ATrr , AnAArTPr 

AGGGCCAGAAACAATACCAACTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGACCAGAAAGCGGGATCGAGCCT 

AATTAAACACGCTGAACAGATCCTTCGGAAGCGGGGGG 

CGGACATGCTATGGTGCAATGCGCGGACATCCGCCGCT 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCGTAA 

TGTATAAACGCCTCACATAA 


SEQID
NO: 129


6_14D10


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

AGCATCGCCTCCTTCCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCATAAACAGTATCAGCTGAGAGGGATGGCGAC 

ACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 130 


6_15G7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTAAGTATGAAACCGATTTGCTCGGGGGC 

CAGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO: 131 


6_16A5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 
a r % rw r r r rr* a r^r^Tr^nnTnn at a tt a rrnnnnr a a hpth a tc* 

AGCATCGCCTCCTTTCACCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGAAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 132 


6_16F5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

A fTTTTTf"* A PPTPr»nTfJfJ ATA TT A f^r^flrddCH^ A A HPTft A TP 1 

AGCATCGCTTCCTTTCATCAAGCCGTACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 133 


6_17C5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGCAAGTATGAAG<:CGATTTGCTCGGGGGC 
AL-G1 1 ILALLlLuuluuAlAl 1 ALA^JtoLiGCAAGC 1GA1L-

AGCATCGCTTCCTTTCATCAAGCCGAGCATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGAAACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACGTACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 134 


6_18C7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAGGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTATCGGGGCAAGCTGATC

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGATATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTTTACGACATACCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 135 


6_18D7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

A i~^t — " 1 u 1 u 1 A ^^^-^TV* V / »' 1 VV A T 1 A ' 1 " 1 1 A / — — ' *^ f~< 1 — ' A A i"*/*TTV1 A TV 

ACGTTTCACCTCGGTGGATATTACCGGGGCA^ 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA


SEQID 
NO: 136 


6_19A10 


ATGATTGAAGCCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 
CGAAGTCTACGACATACCGCCGACCGGACCCCATATTTT 
GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 137 


6_19B6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTATCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT










GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTCGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 138 


6_19C3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 139 


6_19C8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTACACCTCGGTGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCAAGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGGAATTGACGTAA 


SEQ ID 
NO: 140 


6_20A7


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGC 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCTTCCTTTCATCAAGCCGAACATTCAGATCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCA

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO: 141 


6_20A9 


ATGATTGAAGTCAAACCAATAAACGCGGGAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG
CTTATCCGCCATGCCGAAGAGCTTCTACGGAAAAAAGG

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 142 


6_20H5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 
a rnTTTr a c y c v Tc y aciC'C\ci at a tt a r^c^nnnnr* a a rir^rn a t 

AI^VJ 111 l^/\V-^ 1 LUuLuuA 1 A 1 1 ALUuuuuLAAuL 1 KJJ\ 1 

CAGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGATACCGTGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 143 


6_21F4 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCGTTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

rir*ry~r r T r rr~* a ppTrn^Tnn at a tt a pprinnnp a a ar^m a t^ 
IjtUvjt 111 LALL 1 L^OVjr 1 vjrUr/\ 1/\1 1 /V^I^LjoOLjv^ A AUL 1 LjA 1 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACGTACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 144 


6_22C9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA
TGAGATCAGGCACCGCATTCTCCGGCCGAATCGGCCGC
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC

ACCjt 111 LALL 1 Ll_rfj 1 LrljA 1 A 1 1 ALLuvjLiIjLAAvjL 1 LA 1 l_ 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGGGCTT

GAAGGCAAAAAACAGTATCAGCTGAGAGGGATGGCGA

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG

GCGCGGACCTTTTATGGTGCAACGCCAGGACTTCCGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG

AGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA


SEQID
NO: 145




ATOATTfrA AfrTP A A APT 1 A ATA A APOfTKrA AfrATAfnTA 

TGAGATCAGGCACCGTATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCATGTATGAAACCGATTTGCTCGAGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAGCATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA
GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC
GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG
ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 146 


6_22H9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGATGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCCCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 147 


6_23H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGGAACTGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGGATCGGTTCCTTTCATCAAGCCGAGCAACCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAGCAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 148 


6_23H7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCAGAAGAGATTCTTCGGAAAAAAG 

GCGCGGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO: 149 


6_2H1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCGTTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACCGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAATCTACGACATACCGCCGATCGGACCTCATATTTTG
ATGTATAAGAAATTGACGTAA


SEQID 
NO: 150 


6_3D6 


ATGATTGAAATCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ALu 111 C ACC 1 CGG 1 GGA 1 A 1 1 ACCOAGGL, AAGC 1 OA i C 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CTCTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:151 


6_3G3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO: 152 


6_3H2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACATAA 


SEQID
NO: 153 


6_4A10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 
ACGll TCACljrCGGTGGAT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID
NO:154


6_4B1


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCGTACTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

GGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGGACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACATAA 


SEQID 
NO: 155 


6_5D11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:156 


6_5F11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTAATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCCACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:157 


6_5G9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTAAT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGAGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGATATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:158


6_6D5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATGCGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACTGATTTGCTCGGGGGC 
ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT

CAGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG

GCGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:159


6_7D1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCAGGGGT 
GGGTTTGAPrTrGGTGGATATTAPCf^GGPA AGPTGATP 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:160


6_8H3 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 
AGGTTTCAf , rT(~ 1 GGTGGATATTAPC'nGGGr , A AfrrTfrATP 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:161


6_9G11 



ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 
A CCICTTC A rrTrfifiTYiGATATT A (TCiaClCiC A A ("WTO A T 

CAGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTA 

CGCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAG 

GCGAAGTCTACGACATACCGCCGGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQ ID NO:162


6F1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGGTC 

TGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCTT 
GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CGCTTGATGGATACCGCGAGCAAAAAGCGGGAAGCACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGAAAAAAAGG 

CGCGGACCTTTTATGGTGCAATGCCAGGACATCTGTGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:163


7_1C4 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAGCATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGATATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:164


7_2A10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACTGATTTGCTCGGGGGC 
ACGTTTCATCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG

GCGCGGACCTTTTATGGTGCAACGGCAGGACATCTGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT
GATGTATAAGAAATTGACGTAA 


SEQ ID NO:165


7_2A11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:166


7_2D7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGTGAGCAAAAAGCGGGAAGTACG 
CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GTGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:167


7_5C7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGTGGGAAGCACG 

CTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCGA

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGATATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:168


7_9C9 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAAATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTCATCCGCCATGCCGAAGAGCTTCTACGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:169


9_13F10 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTTGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGACTGGGCCCCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:170


9_13F1 


ATGATTGAAGTCAAACCAATAAACGCGGAGGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCGTTTCACCTTGGTGGATATTACCGGGGCAAGCTGGTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCG 
AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG
CGAAGTCTACGACATACCGCCGACTGGGCCCCATATTTT
GATGTATAAGAAATTGACGTAA 


SEQ ID NO:171


9_15D5 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGACGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTCTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:172


9_15D8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGGT 

CAGCATCGCCTCCTTTCATCAAGCTGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGCGCTTCTTCGGAAGAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACACACCGCCGGTCGGACCCCATATTTT 

GATGTATAAGAAGTTGACGTAA 


SEQ ID NO:173


9_15H3 


ATGATTGAAGTCAAGCCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATATGCTCAGGGGT 

GCGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCACGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTTAGCGAACAGGG 

CGAAGTCTACAACACACCGCCGGTTGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:174


9_18H2 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGCGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGTAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACA 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGGTCGGACCTCATATTTTG 










ATGTATAAGAAATTGACGTAA 


SEQ ID NO:175


9_20F12 


ATGATTGAAGTAAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCGTTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

a c^ri r r r T Ji r{~ % a rrTrnp»Tnn a nr a tt a c^c^cxciczacci a nr^mryrr* 
Pi\^Kj III LALL I Cvjljr I Kjkj A 1A1 I Av^^oLrLfO^VJALrC 1 vjljr 1 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGACATACCGCCGGTCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:176


9_21C8 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGTATGTATGAAACTGATTTGCTCGGGGGC 

AQAj 111 C ACC 1 CQjOLajQjA 1 A 1 1 ACCQjCjvjCjC A AOL 1 Lr A 1 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTCGAAGGATACCGCGAGCAAAAAGCGGGCAGTA 

CGCTAATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGACCTCTTATGGTGCAACGCCAGGACATCTGC 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGATCAGG 

GCGAAGTCTACGACATACCGCCGGTCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQ ID NO:177


9_22B1 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATAAGGCACCGCATCCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGl 1 lCACCTCOurGGArATi 1 ACCQOCjGCAAGCTCKjTC 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACTTACCGCCGACCGGACCCCATATTTTG 

ATGTATAAGAAATTGACGTAA 


SEQ ID NO:178


9_23A10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 
TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 
ACGCTTCACCTCGGTGGATATTACCGGGGCAAGCTGGT 
r a nr a TTnrrTrrTTTP a tp a AnrrnA apa a n a hpt 

LAULA X 1 1 1 111 1 v^AAuL-L.VJ/\/\L,/\ 1 L^L>/\VJ/\ljrv^ 1 

TGAGGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG

ACACTTGAAGGGTACCGCGGGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAATGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:179


9_24F6


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA




TGAGATCAGGCACCGCATTCTCAGGCCGAATCAGCCGC 

TAGAAGCATGCAAGTATGAAACCGATTTGCTCAGGGGT 

GCG n 1 CACC 1 CUCj 1 (j(jA 1 A 1 1 ACCOCjOvjCAAQjC I OA 1 (_ 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGCGCTTCTTCGGAAAAAAGG 

CGCGGACCTTTTGTGGTGCAACGCCAGGACGTCTGCGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGACCGGACCCCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:180


9_4H10 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACTGATTTGCTAGGGGGT 

ACGCJITCACCTCGGTGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GCGCGGACCTTATATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACATAA 


SEQ ID NO:181


9_4H8 


ATGATTGAAGTCAAACCAATAAATGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGAGGC 

ACGTTTCACCTAGGTGGATATTACCGGGGCAAGCTGAT 

CAGCATCGCTTCCTTTAATCAAGCCGAACATCCAGAGCT

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACACTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACATAA 


SEQ ID NO:182


9_8H1 


ATGATTGAAGTCAAACCAATAACCGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCT 

AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGAACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGACCGGACCCCATATTTT 

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:183


9_9H7 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATGCGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGAGC 










AGCATCGCCTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGAGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGCG

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCTGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:184


9C6 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 

TGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAATGCCAGGACATCTGTG 

AGAGGCTAGTATGAAAAGCTCGGCTTCAGCGAACAAGG 

CGGGGTCTACGATATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGGCGTAA 


SEQ ID NO:185


9H11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGT 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGT 
a rnTTTP a r^r^vr^nru^nn ata tt a {T*ananc* a a nrrn a t 

CAGCATCGCTTCCTTTCATAAAGCCGAACATTCAGAGCT 

TGAGGGCGAAGAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAG 

GGGGCAGACCTTTTATGGTGCAATGCCAGGACATCTGT 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGATCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQ ID NO:186


0_4B10 


ATGATAGAAGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 

TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 
/-i a tttp a ptt a nnpnnr i * i w i * i 1 apa nnnnp a a a pTn a ttt 

v_,/\ 111 LAU 1 1 rVvJVjrv^Vjrvj^ 1111 ALAuuuuLAA/\L 1 KjA 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCG 
AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 
TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGACTCT 
AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 
caaAC atgctttogtopa ATnrnrnfi ac a accgcctc a 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 
GATATTTGATACGCCGCCAGTAGGACCTCACATCCTGAT 
GTATAAAAGGCTCACATAA 


SEQ ID NO:187


0_5B11 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG
CATTTCACTTAGGCGGCnTTTACGGGGGCAAACTGATTT 
CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCG
AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGACTCT 

AATTAAACACGCTGAACAACTTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAAGATCACA 


SEQ ID NO:188


0_5B3 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAACCGATTTACTTCGTGGTG 
o a tttp a ptt a nftrnnp' ititapa nnnar* a a a a ttt 

\^J\ 111 v_,/\l_, 1 1 AuuL/UUL 1111 nLnuUuULnnnL 1 vjr/V 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGGCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAACAACTTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACA 


SEQ ID NO:189


0_5B4 


ATGCTAGAGGTGAAACTGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGT 
TAGAAGCGTGTATGTATGAAACCGATTTACTTCGTGGTG 

r* A T v T r Vf~* A r^TT A rirJf^f^r^/^TTTT ATA f~ini~±Cir* AAA /Tfl A TTT 
l^/V 111 LAL 1 1 AajOI^LjV-JL- 1111 ALAUuuuLAAAL 1 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCG 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTTTCGTGATCAGAAAGCGGGATCGAGTCT

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGAACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACA 


SEQ ID NO:190


0_5B8 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 
o a tttp a ptt a cinc^aac* rm * apa ncu~inr* a a a rrn a ttt 

^ A 111 1 1 AOVJl^vjlJt^ 1111 ALAuuuuLAAAL 1 VJ/\ 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTOATAPOPPOCV 1 AOTAOOAPr , TPAr'ATPr^OATO 

TATAAAAGGCTCACA 


SEQ ID NO:191


0_5C4 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 
TAGAAGCGTGTATGTATGAAACCGATTTACTTCGTGGTG 
CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT 
CCATAGCTTCATTCCACCAGGCCGAGCACTCAGGCCTCC 
AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 
TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTAT 
AATTAAACACGCTGAAGAAATTCTTCGTAAGAAGGGGG
CGGACTTGCTTTGGTGCAATGCGCGGACGTCCGCCTCAG 
GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 
ATATTTGACACGCCGCCAGTAGGACCTCACATCCTGATG 
TATAAAAGGATCACA 


SEQ ID NO:192


0_5D11 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 

TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

P A TTTr* A CTT A GGPGGPTTTT A C A CiCiCiCIC A A A CTG A TTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGACTCT 

AATTAGACACGCTGAACAACTTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAGGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGCTCACA 


SEQ ID NO:193


0_5D3 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 
a TTnrr* 1 a ptt a aru^nnr^r a tt ac*a CrCinnr 1 a a a r^TO athtt 

111 1 1 /VOvjrv^OOv-. 1/Vl I Av>AUUUULn/\AL 1 \Jfi\ 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACATAA 


SEQ ID NO:194


0_5D7 


ATGATAGAAGTGAAACCGATTAACGCAGAGGAGACCTA
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA
TAGAAGCGTGTATGTATGAAACCGATTTACTTCGTGGTG

r* a tttp a ptt a nnpnnrTTTT apa fWrftP a a a pth attt 

O/V 111 1 1 /YvJvjA^OVJA^. 1111 ALAuUUUV^A/yiL 1 \JJ\ 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTC 

GAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTC 

TAATTAGACACGCTGAACAACTTCTTCGTAAGAAGGGG 

GCGAATATGCTTTGGTGTAATGCGCGGACAACCGCCTC 

AGGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAG 

AGATATTTGATACGCCGCCAGTAGGACCTCACATCCTG

ATGTATAAAAGGATCACA 


SEQ ID NO:195




ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 

TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

CACTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTTTCGTGATCAGAAAGCGGGATCGAGTCT

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 
GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAAAG
GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 
TATAAAAGGATCACA 


SEQ ID NO:196


0_6D10 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACA 


SEQ ID NO:197


0_6D11 


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

ACGCTTCACCTCGGTGGATATTACCGGGGCAAGCTGGT 

CAGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGGTACCGTGAGCAAAAAGCGGGCAGTAC 

GCTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGG 

GGGCAGACCTTTTATGGTGCAACGCCAGGACATCTGCG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGGTCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQ ID NO:198


0_6F2 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 

TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTATTACAGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTTTCGTGAGCAGAAAGCGGGATCGACTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGATACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACA 


SEQ ID NO:199


0_6H9 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 

1 AGAACjCCjIOI AICjI A I Cj AAACAAj A 1 1 1AL1 lCLilLnjrlCj 

CATTTCACITAGGCGGClTTTACGGGGGCAAACrG 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCG 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGACTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAAGGGGG 

CGAACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGACACGCCGCCAGTAGGACCTCACATCCTGATG 
TATAAAAGGCTCACA


SEQ ID NO:200


10_4C10 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 
c a tttp a <tt a rwir^nncTKm' Ari anaac a a a era a ttt 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACNTGCTTTGGTGCAATGCGCGGACATCCGCCTCA

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGATACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGCTCACATAA 


SEQ ID NO:201


10_4D5 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

111 V_/\V_ 1 1 /\OOL^OLjI— 1111 /\LAUUuUL/\./\/\U 1 Vjr/\ 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGACTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACATAA 


SEQ ID NO:202


10_4F2 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTTTGAAAGCGATTTACTTCGTGGTG 

L>A 111 LAL 1 1 AvjLj^wO^ 1111 ALAuuvjuVyAAAL 1 \Ji\ 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGTAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQ ID NO:203


10_4F9 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

L,A 111 L-AL 1 1 AvjLjL^OLjv^ 1111 ALAuUuuLAAAL 1 uA 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTTTCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGCTCACATAA 


SEQ ID NO:204


10_4G5


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA




TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTTTGAAAGCGATTTACTTCGTGGTG 
n ATTrr'Ar'TTAorir'orir'TATTAPAfiofTriPA a AnTrATTT 

V^/V JL A X * r\\ JL JL AYvJ v_r^- vJvJTV-^ A t\ J i. /ta^^yvjvjvj VJ^/vrvrw^ l \Jr\. ill 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTACCGCGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGCTCACATAA 


SEQ ID NO:205


10_4H4 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

\^J\ 111 l_/\V_, 1 1 AOVJi^.VJkJV-' 1111 ALAUUUULAAAV/ 1 VJ/\ 111 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACATAA 


SEQ ID NO:206


11_3A11 


ATGATAGAAGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTGAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

l^/V 111 l^/\V^ 1 1 /VOLJV-.\JvJi^ 1111 ALAuuuVJLAAaL 1 KJx\ 111 

CCATAGCGTCATTCCACCAGGCCGAGCACCCAGACCTC 

CAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTC 

TAATTAAACACGCTGAACAAATTCTTCGTAAGAGGGGG 

GCGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQ ID NO:207


11_3B1 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTGAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTTTGAAACCGATTTACTTCGTGGTG 

L^A. Ill L^/VLx 1 1 /\OOrL^vjrvjL^ 1111 ALnUUUuLAnAl/ 1 \Jf\ ill 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAACTCCGAGGTATGGCTACC 

TTGGAAGGTTTTCGTGAGCAGAAAGCGGGATCGACTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAGGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGACACGCCGCCAGTAGGGCCTCACATCCTGATG 

TATAAAAGGCTCACATAA 


SEQ ID NO:208


11_3B5 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTTTGAAAGCGATTTACTTCGTGGTG 
CCATAGCGTCATTCCACCAGGCCGAGCACTCGGAACTC

CAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTC 

TAATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGG 

GCGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTC 

AGGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAG 

AGGTATTTGATACGCCGCCAGTAGGACCTCACATCCTG 

ATGTATAAAAGGATCACATAA 


SEQ ID NO:209


11_3C12 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

A TTTP A PTTY»r*PrPrvOPTT , TT A PrTfrOnfYP AAA PTfr ATTT 

CCATAGCGTCATTCCACCAGGCCGAGCACCCAGACCTC 

CAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTC 

TAATTAGACACGCTGAACAACTTCTTCGTAAGAGGGGG 

GCGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTCGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGATCACATAA 


SEQ ID NO:210


11_3C3 


ATGATAGAAGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

CCATAGCGTCATTCCACCAGGCCGAGCACTCAGAACTC 

CAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTC 

TAATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGG 

GCGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGACACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACATAA


SEQ ID NO:211


11_3C6 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTTTGAAAGCGATTTACTTCGTGGTG 

f ATi-TY" 1 AC^^ A^C^CU^C^VTT^T AC'CiCiCiCiCTC' A A ACTCtATTT 
L.A1 X Xv-.x\v^X X r\VJ*J\_AJVJV^ X X X X /\V_AJVj*jvJVJ^x\/Vf\\-^ AVjrVX X X 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCG 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGACTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACATAA 


SEQ ID NO:212


11_3D6 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 
CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT
CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 
AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGCTCACATAA 


SEQ ID NO:213


1_1G12 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 
r 1 a TTTr a nT a nncnnc^x^rw a c^ananac^ a a apt(tATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGCTCACATAA 


SEQ ID NO:214


1_1H1 


ATGATAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 
CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGA 
TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 

1 V^vJ 1 1 ^.\-»/\ 111 vJvJvJv^vJvJvJ 1 IV^l jc\ 1 v^vJ JL VJVJV-'V^rVrV J. JL vJ.rv a 

TCGATTGCGAGTTTCCACAAAGCTGAACACTCAGAACT 

GCAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCG 

ACCCTCGAAGGATTCCGTGAGCAGAAGGCTGGCTCTTC 

GCTTATTAGGCACGCCGAGGAGATACTACGGAATAAAG 

GGGCAGATCTGCTTTGGTGTAATGCACGCACGACAGCC 

TCCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCACGGC

GAAGTTTTCGAAACCCCGCCGGTTGGGCCGCACATTCTT 

ATGTACAAAAGAATCACT 


SEQ ID NO:215


1_1H2 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 

CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGTT 

AGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGCT 

VvVJT 1 1 v^L-A 111 VJUVJL'VJvJvJ 1 1 v- 1 r\ 1 v^vJ 1 ^JVJv-'.tt-rvrV X A v_J.r\ iv^i 

CGATTGCGAGTTTCCACCAAGCTGAACACTCAGAACTG 

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

CCCTCGAAGGATTCCGTGAGCAGAAGGCTGGCTCTTCG 

CTTATTAGGCACGCCGAGGAGATACTACGGAAAAGAGG 

GGCAGATCTGCTTTGGTGTAATGCACGCACGACAGCCG 

CCGGTTACTATAAAAAGCTTGGTTTTAGTGAGCAGGGC 

GAAATTTTCGACACCCCGCCGGTTGGGCCGCACATTCTT 

ATGTACAAAAGAATCACT 


SEQ ID NO:216


1_1H5 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 

CGAAATTCGACACAGGATCCTGCGCCCTAATCAGCCGT 

TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 

TCGTTCCATTTGGGCGGGTTCTATCGTGGCAAATTGATC 

TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTG 

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

CCCTCGAAGGATACCGTGATCAGAAGGCTGGCTCTTCG 
CTTATTAGGCACGCCGAGCAGATACTACGGAAAAGAGG

GGCAGATCTGCTTTGGTGCAATGCACGCACGACAGCCG 

CCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGGC 

GAAGTTTTCGACACCCCGCCGGTTGGGCCGCACATTCTT 

ATGTACAAAAAACTCACT 


SEQ ID NO:217


1_2A12 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 

CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGA 

TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 

TCGTTCCATTTGGGCGGGTTCTATCGTGGCAAATTGATC 

TCGATTGCGAGTTTCCACCAAGCTGAACAGTCAGAACT 

GGAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCG 

ACCCTCGAAGGATACCGTGATCAGAAGGCTGGCTCTAC 

GCTTATTAAGCACGCCGAGGAGATACTACGGAAAAAAG 

GGGCAGATCTGCTTTGGTGCAATGCACGCACGTCAGCC 

GCCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGG 

CGAAATTTTCGACACCCCGCCGGTTGGGCCGCACATTCT 

TATGTACAAAAGACTCACT 


SEQID
NO:218 


1_2B6 


ATGATAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 

CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGTT 

AGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGCT 

CGTTCCATTTGGGCGGGTTCTATCGTGGCAAATTGATCT 

CGATTGCGAGTTTCCACCAAGCTGAACACTCAGAACTG 

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

CCCTCGAAGGATTCCGTGATCAGAAGGCTGGCTCTTCGC 

TTATTAAGCACGCCGAGGAGATACTACGGAAAAGAGGG 

GCAGATCTGCTTTGGTGCAATGCACGCACGTCAGCCTCC 

GGTTACTATAAAAAGCTTGGTTTTAGTGAGCAGGGCGA

AATTTTCGAAACCCCGCCGGTTGGGCCGCACATTCTTAT

GTACAAAAGACTCACT 


SEQID 
NO:219 


1_2C4 


ATGCTAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 

CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGA 

TAGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGC 

TCGTTCCATTTGGGCGGGTTCTATCGTGGCCAATTGATC 

TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTG 

CAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGAC 

CCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTACGC 

TTATTAAGCACGCCGAGGAGCTACTACGGAAAAAAGGG 

GCAGATCTGCTTTGGTGCAATGCACGCACGACAGCCGC 

CGGTTACTATAAAAAGCTTGGTTTTAGTGAGCAGGGCG

AAGTTTTCGACACCCCGCCGGTTGGGCCGCACATTCTTA 

TGTACAAAAAAATCACT 


SEQID 
NO:220 


1_2D2 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 

CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGTT 

AGAGGCATGCATGTATGAAAGCGATCTGCTGCGGAGCG 

CATTCCATTTGGGCGGGTTCTATCGTGGCAAATTGATCT 

CGATTGCGAGTTTCCACAAAGCTGAACACTCAGAACTG 

CAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGAC 

CCTCGAAGGATACCGTGATCAGAAGGCTGGCTCTTCGC 

TTATTAGGCACGCCGAGGAGATACTACGGAAAAGAGGG 

GCAGATATGCTTTGGTGCAATGCACGCACGTCAGCCGC
CGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGGCG

AAGTTTTCGACACCCCGCCGGTTGGGCCGCACATTCTTA 

TGTACAAAAGAATCACTTAA 


SEQID 
NO:221 


1_2D4 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 
CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGA 
TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 

npr^ HTTP C A TIM K C\CXCi rnnnTTPT A TC^CVmCU^ AAA TTfl A TC 1 

TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTG 

CAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGAC 

CCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTTCGC 

TTATTAAGCACGCCGAGCAGCTACTACGGAAAAAAGGG 

GCAGATATGCTTTGGTGTAATGCACGCACGTCAGCCGC 

CGGTTACTATAAAAGGCTTGGTTTTAGTGAGCACGGCG

AAATTTTCGAAACCCCGCCGGTTGGGCCGCACATTCTTA

TGTACAAAAGAATCACT 


SEQID 
NO:222 


1_2F8 


ATGCTAGAAGTGAAACCTATTAACGCAGAGGATACTTA 

CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGTT 

AGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGCT 

i^vj 1 1 L^lw/\ 111 OOtOI_,1_jOLj 1 1 v>x 1 1\ 1 1 vjvj^/V^vA. 1 1 vjr/\ IL^l 

CGATTGCGAGTTTCCACCAAGCTGAACATTCAGAACTG 

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

CTCTCGAAGGATACCGTGATCAGAAGGCTGGCTCTTCG 

CTTATTAGGCACGCCGAGGAGATACTACGGAAAAGAGG 

GGCAGATATGCTTTGGTGCAATGCACGCACGACAGCCG 

CCGGTTACTATAAAAAGCTTGGTTTTAGTGAGCAGGGC

GAAATTTACGACACCCCGCCGGTTGGGCCGCACATTCTT

ATGTACAAAAAACTCACT 


SEQID 
NO:223 


1_2H8 


ATGATAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 
CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGTT 
AGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGCG 

V^VJ 1 1 V^Vx A. Ill VjOOv^vJrVjrvj 1 1 1 /V 1 \^>\J 1 00^/Vr\/\ 1 1 vXr\. Iv^l 

CGATTGCGAGTTTCCACCAAGCTGACCACTCAGAACTG 

CAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGAC 

CCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTACGC 

TTATTAGGCACGCCGAGCAGATACTACGGAAAAGAGGG 

GCAGATCTACTTTGGTGCAATGCACGCACGTCAGCCGC 

CGGTTACTATAAAAAGCTTGGTTTTAGTGAGCACGGCG 

AAATTTTCGAAACCCCGCCGGTTGGGCCGCACATTCTTA

TGTACAAAAGACTCACTTAA 


SEQID 
NO:224 


1_3A2 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 

CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGA 

TAG AfrfiP ATGP ATOTATOA A AfiPfi ATCTGCTGPGGGGC 

GCGTTCCATTTGGGCGGGTTCTATCGTGGCAAATTGATC 

TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTG 

CAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGAC 

CCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTTCGC 

TTATTAGGCACGCCGAGGAGATACTACGGAAAAAAGGG 

GCAGATATGCTTTGGTGCAATGCACGCACGACAGCCGC 

CGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGGCG

AAGTTTTCGACACCCCGCCGGTTGGGCCGCACATTCTTA 
TGTACAAAAGAATCACT


SEQID 
NO:225 


1_3D6 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA

TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGCTCACATAA 


SEQID 
NO:226 


1_3F3 


ATGATAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 
CGAACTTCGACAGAGGATCCTGCGCCCTAATCAGCCGA 
TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 
tcottcc ATTTf^ncoooTTCTATCOTOocc' a atto att 

X V»»VJ X X V/V-'ii XXX VJVJ VJ VrfVJVJ VJ J. X X X V^V_I X VJvJV^\^/l/i X X VJxv X V— ' 

TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGAACT 

GCAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCG 

ACCCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTAC 

GCTTATTAAGCACGCCGAGGAGATACTACGGAAAAAAG 

GGGCAGATCTGCTTTGGTGCAATGCACGCACGTCAGCC 

GCCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCACGG

CGAAATTTTCGACACCCCGCCGGTTGGGCCGCACATTCT

TATGTACAAAAGAATCACT 


SEQID 
NO:227 


1_3H2 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 
CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGA 
TAGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGC 
(^01^^ C ATTTftTrnCOriOT A CT ATCGTOfiPr' A ATTCr ATC 

VJ V^VJ X X VwV^A XXX VJ \J VJV_>V_I VJ VJ X ixVy X -tl. X V^VJ X VJ\J\>^>/\/\ X X vJ^V X 

TCGATTGCGAGTTTCCACAAAGCTGAACACTCAGAACT 

GCAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCG 

ACCCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTAC 

GCTTATTAAGCACGCCGAGCAGCTACTACGGGAAAAAG 

GGGCAGATATGCTTTGGTGCAATGCACGCACGTCAGCC 

GCCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGG

CGAAGTTTTCGACACCCCGCCGGTTGGGCCGCACATTCT

TATGTACAAAAAACTCACT 


SEQID 
NO:228 


1_4C5 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 
CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGA 
TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 
Tcmrcc ATTTr^opriorrTTr'T ATPriTfinp a a attoatt 

IVyUl X X X X VJuUV^VJVJu Xl^l rt. x I— \J X vjvJV^ /VrvrV X X vJr\. X 

TCGATTGCGAGTTTCCACAAAGCTGAACACTCAGACCT 

GGAAGGGCAAAACCAGTATCAATTACGAGGGATGGCG 

ACCCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTAC 

GCTTATTAGGCACGCCGAGGAGATACTACGGAAAAGAG 

GGGCAGATATGCTTTGGTGCAATGCACGCACGTCAGCC

TCCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCACGGC

GAAATTTTCGACACCCCGCCGGTTGGGCCGCACATTCTT

ATGTACAAAAGACTCACTTAA 


SEQID 


1_4D6 


ATGCTAGAAGTGAAACCTATTAACGCAGAGGATACTTA 
CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGA
TAGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGC 

TCGATTGCGAGTTTCCACAAAGCTGAACACTCAGACCT 

GGAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCG 

ACCCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTAC 

GCTTATTAGGCACGCCGAGCAGATACTACGGAAAAGAG 

GGGCAGATATGCTCTGGTGCAATGCACGCACGTCAGCC 

GCCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGG 

CGAAGTTTTCGAAACCCCGCCGGTTGGGCCGCACATTCT

TATGTACAAAAGACTCACT 


SEQID
NO:230 


1_4H1 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA

CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGTT 

AGAGGCATGCATGTATGAAACCGATCTGCTGCGGGGCT 

lw.vjr 1 1 v^V^/\ 111 OOOv_>kjrVjrO 1 1 1 /VI K^kj 1 UuLAAA 1 1 U/\ 1^1 

CGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTG 

CAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGAC 

CCTCGAAGGATACCGTGAGCAGAAGGCTGGCTCTACGC 

TTATTAGGCACGCCGAGCAGCTACTACGGAAAAGAGGG 

GCAGATCTGCTTTGGTGCAATGCACGCACGTCAGCCTCC 

GGTTACTATAAAAGGCTTGGTTTTAGTGAGCACGGCGA

AGTTTTCGACACCCCGCCGGTTGGGCCGCACATTCTTAT 

GTACAAAAGACTCACT 


SEQID 
NO:231 


1_5H5 


ATGCTAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 

CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGTT

AGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGCT 

V^Kj 1 1 l^V-*A 111 LjvjO^OLjvj 1 1 /\ 1 v_-vJ 1 UuLUnA 1 1 v_r/\ 1 1 

CGATTGCGAGTTTCCACCAAGCTGAACACTCAGAACTG 

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

CCCTCGAAGGATTCCGTGAGCAGAAGGCTGGCTCTACG 

CTTATTAAGCACGCCGAGCAGATACTACGGAAAAGAGG 

GGCAGATATGCTTTGGTGCAATGCACGCACGTCAGCCG 

CCGGTTACTATAAAAAGCTTGGTTTTAGTGAGCACGGC 

GAAATTTTCGACACCCCGCCGGTTGGGCCGCACATTCTT

ATGTACAAAAAACTCACTTAA 


SEQID 
NO:232 


1_6F12 


ATGATAGAAGTGAAACCTATTAACGCAGAGGAGACTTA 

CGAACTTCGACACAGGATCCTGCGCCCTAATCAGCCGA 

TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 

1 ULj 1 1 CL, A 111 LfLtvjCLxLtvj 11L1AI kAj 1 UuLAAA 1 1 vj£\ 1 

TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTA 

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

rr(^GGAAGGATACCGTGATrAGAAGGCTGGCTCTACG 

CTTATTAAGCACGCCGAGGAGCTACTACGGAAAAGAGG 

GGCAGATATGCTTTGGTGCAATGCACGCACGTCAGCCG 

CCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCACGGC

GAAATTTACGAAACCCCGCCGGTTGGGCCGCACATTCTT 

ATGTACAAAAAAATCACT 


SEQID 
NO:233 


1_6H6 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACTTA 
CGAACTTCGACACAAGATCCTGCGCCCTAATCAGCCGA 
TAGAGGCATGCATGTATGAAAGCGATCTGCTGCGGGGC 
TCGATTGCGAGTTTCCACCAAGCTGAACACTCAGACCTG

GAAGGGCAAAAGCAGTATCAATTACGAGGGATGGCGA 

CCCTCGAAGGATACCGTGATCAGAAGGCTGGCTCTTCG 

CTTATTAAGCACGCCGAGGAGATACTACGGAAAAGAGG 

GGCAGATCTGCTTTGGTGCAATGCACGCACGTCAGCCG 

CCGGTTACTATAAAAGGCTTGGTTTTAGTGAGCAGGGC

GAAATTTTCGACACCCCGCCGGTTGGGCCGCACATTCTT

ATGTACAAAAAAATCACT 


SEQID 
NO:234 


3_11A10 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

V r\ XXX V .rYv^ I X AUVJCUUV^ X T\ X X r\^AVJUVJ\J^r\A/\V X VJrt XXX 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AGTTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACATAA


SEQID 
NO:235 


3_14F6 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

ATTTr A f^TT A nCPnOP' 1 TTT APA CXCICXCXC* AAA CTCi A TTT 
111 UAVy 1 1 /\vJvJV^Ovjl^ 1111 ALAUUUULAAAL 1 VJJ\ 111 

CCATAGCTTCATTCCACGAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACGTCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGCTCACATAA 


SEQID 
NO:236 


3_15B2 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

C A TTTC^ A r"TT A fWYVWT A TT A C 1 aCiCiCiCiC* A A AfTfrATTT 

V_*r\ XXX v^-fW^ X X rtvjvj l_^vJVJV^ 1 f\ X X A\^kJVJV_J VJ AV^Vrt>V^ X VXrV. XXX 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

r'OfrACTTnrTTTGOTriTAATnCQC'GCTACATr'PGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACATAA 


SEQID 
NO:237 


3_6A10 


ATGATAGAAGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 
CATTTCACTTAGGCGGCTATTACAGGGGCAAACTGATTT 
CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 
AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACATAA 


SEQID 

NO:238 


3_6B1 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 

TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTATTACAGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACCCAGAACTC 

CAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTC 

TAATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGG 

GCGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGATCACATAA 


SEQID 
NO:239 


3_7F9 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 

TAGAAGCGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTATTACGGGGGCAAACTGATTT

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGATCACATAA 


SEQID 
NO:240 


3_8G11 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAGAATACTCAGACCCAACCAGCCGA 

TAGAAGTGTGTATGTATGAAAGCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTATTACAGGGGCAAACTGATTT

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ata tttv: AAA pnpr^nrT 1 a nnp a nr~i a /"v^t/- 1 a /-* a nr/^/^m A t 
A 1 A 1 1 1 uAAALuLLULLAu 1 AljljAv_A_* 1 LALA 1 Ct- 1 vjA 1 

GTATAAAAGGATCACATAA 


SEQID 
NO:241


4_1B10 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 
CATTTCACTTAGGCGGCTTTTACGGGGGCAAACTGATTT 
CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 
AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 
TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 
AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGATCACATAA 


SEQID 
NO:242 


5_2B3 


ATGATAGAAGTGAAACCTATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGTAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGATCACATAA 


SEQID 
NO:243 


5_2D9 


ATGCTAGANGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGN 

TAGAAGTGTGTATGTATGAAANCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAACAAATTCTTCGTGAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGACACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGCTCACATAA 


SEQID 
NO:244 


5_2F10 


ATGCTAGAAGTGAAACCTATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

/"> Amy a fir Arrfrccrr t* r /^r^nnanac* a a AnmATrr 
CA1 1 1CAC1 1AGGCGGC1 1 1 lALuWjuutAAAtluAl 1 1 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQID
NO:245


6_1A11


ATnrTAnAnATnA a a<~ty"iatta AcncAaAaaATAOc^TA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT 

CCATAGCGTCATTCCACCAGGCCGAGCACTCAGACCTC 

CAAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTAC 

CTTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTC 

TAATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGG 

GCGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTC
AGGCTACTACAGAAAGTTAGGCTTCAGCGAGCAGGGAG

AGGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTG 

ATGTATAAAAGGCTCACATAA 


SEQID 
NO:246 


6_1D5 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGGGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGATCACATAA 


SEQID 
NO:247 


6_1F11 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQID 
NO:248 


6_1F1 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQID 
NO:249 


6_1H10 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 

TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACGGGGGCAAACTGATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCGGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA 

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGACACGCCGCCAGTAGGACCTCACATCCTGAT 
GTATAAAAAGATCACATAA


SEQID 
NO:250 


6_1H4 


ATGCTAGAAGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGATCAGAAAGCGGGATCGACTCT 

AATTAAACACGCTGAACAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GGTATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQID 
NO:251 


8_1F8 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

r'ATTTr'AP'TTAnr^OrTr'TTT^AP AnfrfrfrPA A ArTfrATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACATAA 


SEQID 
NO:252 


8_1G2 


ATGATAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAGTACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

r A TTTr' A fTT A nnC^dCiC^T A TT A C A CiCiCiCiC A A A PTfr A TTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGCAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGAGACGCCGCCAGTAGGACCTCACATCCTGAT 

GTATAAAAGGCTCACGTAA 


SEQID 
NO:253 


8_1G3 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACTTA 
CGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGA 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 
riTTTrArTTAfMTrrTr.rTATTAPAnrYr.nrA a APTrrATTT 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

ATATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGATCACGTAA 


SEQID 


8_1H7 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAGAATACTCAGACCAAACCAGCCGA

TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 

CATTTCACTTAGGCGGCTTTTACAGGGGCAAACTGATTT

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGAACTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAAACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACATGCTTTGGTGCAATGCGCGGACATCCGCCTCA

GGCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGA 

GATATTTGAAACGCCGCCAGTAGGACCTCACATCCTGA 

TGTATAAAAGGCTCACATAA 


SEQID 
NO:255 


8_1H9 


ATGCTAGAGGTGAAACCGATTAACGCAGAGGATACCTA 
TGAACTAAGGCATAAAATACTCAGACCAAACCAGCCGT 
TAGAAGTGTGTATGTATGAAACCGATTTACTTCGTGGTG 
C A TTTT 1 A r"TT A CldC GGTT A TT A C A G GGGP A A A PTfi A TTT 

V AY JL X X ' — -AY^-. X X AvJ^Jl V-JVJV — 1 AY JL X AYV—-AVVJ VV — X VJ Ai X x X 

CCATAGCTTCATTCCACCAGGCCGAGCACTCAGACCTCC 

AAGGCCAGAAACAGTACCAGCTCCGAGGTATGGCTACC 

TTGGAAGGTTATCGTGAGCAGAAAGCGGGATCGAGTCT 

AATTAGACACGCTGAAGAAATTCTTCGTAAGAGGGGGG 

CGGACTTGCTTTGGTGTAATGCGCGGACATCCGCCTCAG 

GCTACTACAAAAAGTTAGGCTTCAGCGAGCAGGGAGAG 

GTATTTGATACGCCGCCAGTAGGACCTCACATCCTGATG 

TATAAAAGGCTCACATAA 


SEQID 
NO:256 


GAT1_21F12


ATGATTGAAGTCAAACCTATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

A CCVTVTC A PPTPGGGGG AT ATT ACCGGGGCA AGCTGAT 

CAGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCT 

TGAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCG 

ACGCTTGAAGGATACCGTGAGCAAAAAGCGGGAAGCA 

CGCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAA 

GGCGCGGACCTTTTATGGTGCAACGCCAGGACATCTGT 

GAGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGG 

GCGAAGTCTACGACATACCGCCGATCGGACCTCATATTT 

TGATGTATAAGAAATTGACGTAA 


SEQID 
NO:257 


GAT1_24G3


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

A PrVTTTP A Pr^noOTOG AT ATT APCGGGGCA AGCTGATC 

AGCATCGCCTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAATGCCAGGACATTTGTGA

GCGGTTACTATGAAAAGCTCGGTTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTTATATTTTG

ATGTATTAGAAATTGACATAA 


SEQID 
NO:258 


GAT1_29G1


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGT 










A POTTTP A CCTCCIGTCICt A T A TT A CCGOGGC A A OPTO A TP 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGTAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTGCGATATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGGCATAA 


SEQID 
NO:259 


GAT1_32G1


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 
TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 
TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 
A POTTTP A PPTPOOTOO A T A TT A PPOOOOP A A OPTO A TP 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTACGACATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACATAA 


SEQID 
NO:260 


GAT2_15G8


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TGGAAGCATGCAAGTATGAAACCGATTTGCTCGGGGGC 

A POTTTP A PPTPOOTOO ATA TT A CCClClCiCIC' A A OPTO A TP 

AGCATCGCTTCCTTTCATAATGCCGAACATTCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CGCTTGAAGGGTACCGCGAGCAAAAAGCGGGAAGCAC 

GCTCATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAG 

GCGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTG 

AGCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAGGG 

CGAAGTCTACGACATACCGCCGATCGGACCTCATATTTT

GATGTATAAGAAATTGACGTAA 


SEQID 
NO:261 


GAT2_19H8


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATACTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

A POTTTP A PPTPOOTOO AT ATT ACCGGGGCAAGPTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATCCAGAGCTT 

GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGGTACCGCGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAAGG 

CGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGA 

GCGGCTACTATGAAAAGCTCGGCTTCAGCGAACAGGGC 

GAAGTCTGCGACATACCGCCGATCGGACCTCATATTTTG 

ATGTATAAGAAATTGACATAA 


SEQID 
NO:262 


GAT2_21F1


ATGATTGAAGTCAAACCAATAAACGCGGAAGATACGTA 

TGAGATCAGGCACCGCATTCTCCGGCCGAATCAGCCGC 

TTGAAGCATGTATGTATGAAACCGATTTGCTCGGGGGC 

ACGTTTCACCTCGGTGGATATTACCGGGGCAAGCTGATC 

AGCATCGCTTCCTTTCATCAAGCCGAACATTCAGAGCTT










GAAGGCCAAAAACAGTATCAGCTGAGAGGGATGGCGA 

CACTTGAAGGATACCGTGAGCAAAAAGCGGGCAGTACG 

CTTATCCGCCATGCCGAAGAGCTTCTTCGGAAAAAGGG 

GGCAGACCTTTTATGGTGCAACGCCAGGACATCTGTGA 

GCGGGTACTATAAAAAGCTCGGCTTCAGCGAACAAGGC 

GGGGTCTACGATATACCGCCGATCGGACCTCATATTTTG

ATGTATAAGAAATTGACGTAA 


SEQID 
NO:263 


13_10F6 


MIEVKPINAEDTYEmHRILRPNQPIJEACKYETDLLRGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDIPPVGPHILMYKKLT 


SEQID 
NO:264 


13_12G6 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH
LGGYYRGKLVSIASFHQAEHPELEGQRQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
KLGFSEQGEVYDIPPTGPHILMYKKLT 


SEQID 
NO:265


14_2A5 


MEVKPINAEDTYEIRHRILRPNQPI^ACKYETDIXGSTFEIL 
GGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEGYR 
EQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKKL 
GFSEQGEVYDTPPVGPHILMYKKLT


SEQID 
NO:266 


14_2C1 


M1EVKPINAEDTYEIRHRILRPNQPLEACKYETDI1JIGAFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
KLGFSEQGEVYDTPPTGPHELMYKKLT 


SEQID 
NO:267


14_2F11 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG
YREQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYK 
KLGFSEQGEVYDTPPAGPHILMYKKLT 


SEQID 
NO:268 


CHIMERA 


MffiVKPINAEDTYEIRHPJLRPNQPIJEACMYETDLLRGAFH 
LGGYYRGKIJLSIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:269 


10_12D7 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTLH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDJPPTGPHIIMYKKLT 


SEQID 
NO:270 


10_15F4 


MIEVKPINAEDTYEIPJEIRILRPNQPLEACMYETDIJJRGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YMQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
KLGFSEQGGVYDIPPVGPHILMYKKLT 


SEQID 
NO:271 


10_17D1 


MffiVKPINAEDTYEIRHPJIJIPNQPIJEACKYETDLLGGTFH 
LGGYYRGKLISLASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:272 


10_17F6 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHSELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
KLGFSEQGEVYDIPPVGPHILMYKKLT 


SEQID 

NO:273 


10_18G9 


MIEVKPINAEDTYEIRHRILRPNQPI£ACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHSELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
SEQID
NO:274 


10_1H3 


MEVKPlNAEDTYEnaiPJLPJ'NQPLEACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGRKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
ITT OFSFOOFVYDIPPTGPHILMYKKT T 

JVA./VJX *■» 1 J *^£ v 1 1 ' V X JL/11 X X VJ1 A JJUL-/XVX X IViVl^ X 


SEQID 
NO:275 


10_20D10 


MIEVKPINAEDTYEnmRIUlPNQPIJEACMYF^IXCKjTLH 
LGGYYRGKOSIASFHQAEHPEIEGQKQYQOIGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
T frFSFOOOVYniPPVOPTTn MYldCT T 


SEQID 
NO:276 


10_23F2 


IVOEVKPmAEDTYEIRHPJLRPNQPIJEACMYElDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
id rvPSPoriPvvnTPPvrTPWiT mytckt t 

TX 1 i\jrj ^£ V X X-/ 1X1 V VJ X XXXJLaLVX X XvXVX^X 


SEQID 
NO:277 


10_2B8 


MffiVKPmAEDTYEmmn.RPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
t rip^jprvriPvvriTPPvrrPTTTT A/ividd t 


SEQID 
NO:278 


10_2C7 


MffiVKPINAEDTYEIRHRlIJlPNQPlEACKYETDLLRGAFH 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY

REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 

T l^T7<3T70i^T7VVT 1 TPPVrrP'PITr fJTVTCTCI T 


SEQID 
NO:279 


10_3G5 


IVxTEVKPINAEDTYEIRHRI^ 

LGGYYRGKLVSIASraQAEHPElJEGQKQYQLRGl^ 
Yl^QKAGSTLIRHA^ 

TTT OFWn(TRVYn IPPTCrPTTTT MYKKT T 
in. i a i r 1 oxj/v^ vjrj_!/ v x xjutst x vj jt xxxx ^ivx x j\ i\ x 


SEQID 
NO:280 


10_4H7 


MIEVKPINAEDTYEmHRILRPNQPLEACM 

LGGYYRGIsXVSIASraQAEHPELEGQKQYQLRGMATI^ 

YxvEQKAGSxTJOmAEEL^ 

JSJ^( vjrJr o xl V I xJXxr r 1 Urill l-^ivx x JVXVJ / X 


SEQID 
NO:281 


10_6D11 


IV1IEVKPINAEDTYE 

LGGYYRGKl.VSIASmQAEI^ELEGQKQYQUlGMATLEG 
YREQKAGSTLIRHAEELLR 

iv i a rr o yjfi-j v x i/irr v vjx xxjul<ivx x xvrvi * x 


SEQID 
NO:282 


10_8C6 


MlEVKPINAl^ 

LGGYYRGxsXISIASraQAEExPEI^GQKQYQLRGIVx^ 
REQKAGSTLIRHAEEIXRI^GADIX 

XwVJTX^OXiv^vJkJ V X i/JLT 1 V vJx XJ-LI <1VX X JX IX l-« X 


SEQID 
NO:283 


11C3 


MffiVKPINAEDTYEl^^ 
LGGYYQGKIJSIASFHQAEH^ 
REQKAGSTL1RHAEEIXREXGADLLW 
LOJroxiv<^vjO v x u isr trLKJXr xi.ijl» i vi i j\ r\ i < x 


SEQID
NO:284


11G3 


MffiVBCPINAEDTYEIRHPJIJlPNQPl^ACMYETDLLUCr'lJhH 
LGGYYQGKLISLf\.SFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYEK 
LGFSEQGGVYDIPPIGPHILMYKKLA 


SEQID 
NO:285 


11H3 


MffiVKPINAEDTYEniHRILRPNQPIJEACMYETDLLGGAFH 
LGGYYQGKLISIASFHKAEHSFXEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVRGYYEK 
LGFSEQGGVYDIPPIGPHILMYKKLT 


SEQID
NO:286


12_1F9


MIEVKPINAEDTYEIPJ3FJLRPNQP1EACKYETDLLGGTFH




LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
X-iLjjt' o jjiV^Ljjj/ v xiL/ur ir x urni i ayx i x\.xvl» x 


SEQID 
NO:287 


12_2G9 


MIEVKPDSfAEDTYEIRHRILRPNQPI^ACKYETDIXGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 

jx LiVJ-T o Xi v,,/ Urxl V I U X rxr V Urxl 1 1 AVL I ivtSJL. X 


SEQID 
NO:288 


12_3F1 


MIEVKPINAEDTYEIRHRIIJIPNQPLEACKYETDLLGGTFH 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 

XwVJJroXiv£\jvj V I Dlrr V VJxxi 1 1 Avl I XS IS 1 * X 


SEQID 

NO:289 


12_5C10 


ME\^INAEDTYE1RHEIILRPNQPIJEACKYETDLIJjGTFH 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 

REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 

Lur o jj/^Oxi V I \Jl\rr X vj-rxl 1 1 ,iyX I xSJvL, X 


SEQID 
NO:290 


12_6A10 


MEVKPINAEDTYEIRHPJIJIPNQPIJEACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 

TTT rilHQTHOr^^ArVT^TPP\70PTTrT AAWVT T 
JS..I AjrX^X^VsjlJVJ V I XJiirr V LrJr XI 1 1 <1YL I X\J\ 1 * X 


SEQID 
NO:291 


12_6D1 


MffiVKPINAEDTYEIRHRIIJ^NQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 

T /TP C T7 C\(~XCV\ T~\TT\XXyO\ 1CVOXXW A/TVn&T'fcrT T 
X^Ox H oii^vju V I Dirif V OJr XI 1 1 ,IVX X JSJSJ^ X 


SEQID 
NO:292 


12_6F9 


MIEVKPINAEDTYEIRHRIIJRPNQPLEACKYETDIXGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
.LLrr orsy Lrii V x JJJLr Jr 1 vjJrxl 1 1 *vA X xvlVL. 1 


SEQID 
NO:293 


12_6H6 


lVLmVl<TINAl^^ 

LGGYYRGxnXVSxASMQAI^ 

Yl^Ql^GSTLlIlx^ 

V"T i^T7QT50^T7ArVT^TPP r T/^iPTTTT A/TVT?"KT T 
xsJUOX^oXiv^OXl V x XJXr\r X OJrXI 1 1 jyx I XSJSJ-i X 


SEQID 
NO:294 


12_7D6 


MEVKPINAEDTYEniEIRILRPNQPLEACKYETDLLGGTFH 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK

T /TRCT?0/^rTl7V^TPPXriPTTTT TV/TVTirVT TP 


SEQID 
NO:295 


12_7G11 


MmVKPINAEDTYEIRHPJLRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
XwOJroxiyLTXi V x L> 1 rr V vjjji 1 L JM x JsJv_U 1 


SEQID 
NO:296 


12F5 


MIEVKPmAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYQGKLISIASFHKAEHSELEGQKQYQLRGMATLEGY 
RPOT? AO<\TT TOTTAFPT T PK"KnADT T WPNTAPT^V^OYYTCK 
LGFSEQGGIYDIPPIGPHILMYKKLT 


SEQID 
NO:297 


12G7 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH
LGGYYQGKLISIASFHELAEHSELEGQKQYQLRGMATIJEGY 
REQKAGSTLIRHAFFT ,T .RKKGADLLWCNARTS VSGYYKK 
LGFSEQGEVYDIPPIGPfflLMYKKLT 


SEQID 
NO:298 


1_2H6 


MIEVKPINAEDTYEIRHPJLRPNQPIJEACMYETDIXGGAFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEFI ,T RKKGADLLWCNARTS AS G YYKK 
LGFSEQGGVYDIPPIGPHILMYKKLT


SEQID 
NO:299 


13_12G12 


MIEVTCPINAEDTYEIRHRIIJRPNQPI£ACMYETDLLGGTFH 

LGGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEEY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK

LGFSEQGEVYDIPPVGPHILMHKKLT


SEQID 
NO:300 


13_6D10 


MIEVKPINAEDTYEmHRlIJ^NQPIJEACMYETDSLGGTFH 
LGGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:301


13_7A7 


MEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLRSAFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:302 


13_7B12 


MEVKPINAEDTYEIPvHRILRPNQPLEACKYETDLLGSTFHL 
GGYYRGKIJSLASFHQAEHPELEGQKQYQLRGMATLEGYR 
EQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKKL 
GFSEOGEVYDIPPTGPEDDLMYKKLT 


SEQID 
NO:303 


13_7C1 


MIEVKPINAEDTYEIPJEIRrLPJ > NQPL£ACKYETDLLRGAFH 
LGGYYRGKLiSIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLERHAEELLRKKGADLLWCNARTSARGYYKK 
LGFSEQGEVYDIPPTGPHILMYKKLT 


SEQID 
NO:304 


13_8G6 


MffiVBCPINAEDTYEIRHPJLRPNQPLEACKYETDSLGGTFH 
LGGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKKLT


SEQID 
NO:305 


13_9F6 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEEIJLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDIPPVGPHILMYKKLT


SEQID 
NO:306 


14_10C9 


MIEVKPINAEDTYEIRHRILRPNQPI^ACKYETDIJLRGAFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:307


14_10H3 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK
KLGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:308 


14_10H9 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
KLGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:309 


14_11C2 


MEVKP1NAEDTYEIRHRILRPNQPIJEACKYETDLLGSTFHL 
GGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDTPPTGPHILMYKKLT 


SEQID 
NO:310 


14_12D8 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYK 
KLGFREQGGVYDIPPVGPHILMYKKLT 


SEQ ID
NO:311


14_12H6


MIEVKPINAEDTYEIRHRIIJIPNQPLEACKYETDLLGGAFH
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEET .T .RXKGADLLWCNARTS ASGYYKK 

. LAJX^oXwV^vJrxi V I xJUrx X vjfx xj XLJ.VX I iVxvi < 1 


SEQID 
NO:312 


14_2B6 


MffiVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEET T RKKGADLLWCN ARTS ASGYYKK 

t rjTn^inorir^vvTiTPPvrTpWTT A/rv r K"K'T t 
ivVjrorr^ou v i xjllx v vjjrn. i, lj.vx x xvix l> i 


SEQID 
NO:313 


14_2G11 


MffiVKPINAEDTYEIRHRD^NQPLEACKYETDLLRGAFH 
1J3GYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YMQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 

IS. 1 X ifcs MA^f I Ttl V X XJ XJT JT X * Tr n 1 1 ,IVI x j\J\1j X 


SEQID 
NO:314 


14_3B2 


1S/DEVKPINAEDTYEIRHPJDLRPNQPLEACKYETDIXRGAFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYK 

JsJ^vJX^o j^v^Lj vjr V I Dxrr AxJiril 1 1 ^IVL I iSJSJ_>l 


SEQID 
NO:315 


14_4H8 


MIEVKPINAEDTYEIRmiLPJNQPI^ACKYETDLLGSTFHL 
GGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGYR 
EQKAGSTIJRHAEELLRKKGADLLWCNARTSASGYYKKL 
vjroxiv^vjxi V x D x ir It v OirrxxJ /ivi x isjsjl, x 


SEQID 
NO:316 


14_6A8 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKLVSIASFNQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK
J&J^CjrJr o Ji vjJti V IJJ1 Jrr V Lrrxi V LJV1 x xvisX, x 


SEQID 
NO:317 


14_6B10 


NGEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK

Lur o xiV^^- 1 ^- 1 ▼ x xJlVJJrJr V Uril II JVx x xSJVX-» X 


SEQID 
NO:318 


14_6D4 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYKK

X^vJrx^oJ&v^vjrxi V IJJ1 xx V vJi IjJULflVl X XSJtSJLr X 


SEQID 
NO:319 


14_7A11 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH 
LGGYYRGKLVSIASFHQAEHPELEGLKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK
xvLrvjr'oxivJ Oxi Vil/1 JrJr 1 LrJrxl 1 1 .IVl x JSJvx^ 1 


SEQID 
NO:320 


14_7A1 


MIEVIG^INAEDTYEi^^ 

LGGYYRGKLVSIASraQAEOTEl^GQKQYQlJRGMATLEE 

YREQKAGSTxJtftH^^ 

KIAjx^o JlC^ Cjli V I JJ 1 x^ir Avjjrli 1 1 /1V1 x xsJSJLv 1 


SEQID 
NO:321 


14_7A9 


M1EVKPINAEDTYEIRHRII.RPNQ 

LGGYYRGl^VSIASraQAKHPEIXGQKQYQLRGMATlJBG 
YPFOKAfrSiTT TRRAFFT J RKKGADLLWCN ARTS AS GYYK 
KLGFSEQGEVYDTPPVGPHI^^ 


SEQID 
NO:322 


14_7G1 


NIIEVKPINAEDTYEI^^ 
LGGYYRGKLISIASFNQAEl^^ 
REQl^GSTLIRHAEAIXRI^ 
LGFSEQGEVYDTPPVGPimMYKXLT 


SEQID 

NO:323 


14_7H9 


MffiVKPlNAEDT^ 

LGGYYRGKLVSIASFHQAEOT^ 

YRE0KAGSTL1RHAEEIXRKX 
*TT OF^FnOFVYnTPPVOPTTTT MVHTCT T


SEQID 
NO:324 


14_8F7 


MIEVKPINAI^TYEIRHRxLRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YREQKAGSTLTPJiAEAIXRKKGADLLWCNARTSASGYYK 

xvx^vjx^oijfV^vjx-f v x xsMJCjr x vjrn 1 1 a vx x xvr\ ,i >x 


SEQID 
NO:325 


15_10C2 


MffiVKPINAEDTYExRHRD^NQPLEACKYETDLLRGAFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTTASGYYK 
t^t OFSPnnFVFnTPPTfrPTTrT A/rvnK"Kir t 

xsj_<vji7 o x_«v^ kjxj- v jruirr x vjx xi 1 1 itvx x x\ rs i > x 


SEQID 
NO:326 


15_10D6 


MIEVKPINAEDTYEIRHRILPJNQPLEACMYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEET J .RKKGADLLWCNARTS ASGYYK 
TfT fiFSFOOFVYTYrPPVOPTTTT A/TYTCKT T 

XX l jVJI OX->V^/V-JJ— < V X J— 'IX 1 V VJi XXXlwlVX X R ra 1 i X 


SEQ ID
NO:327 


15_11F9 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH
LGGYYRGKLVSIASFNQAEHPELEGQKQYQLRGMATLEG 
YP^QKAGSTLIRHAEEIJJIRKGADLLWCNARTSASGYYK 

ITT OPQPOriTH VVr>TPPTfrPT4TT lVfVT^'K'T X 
IV 1 A Tr^ HA^JV Tr, v I i^/JJr Jr A vJrlllJ ilVI X J\, IN. 1-/ J. 


SEQID 
NO:328 


15_11H3 


NlIEVKPINAEDTYEIRHMIJIPNQPLEACKYETDLmGAFH 

LGGYYRGKLIS1ASFHQAEHPEJLEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEAIiRKKGADLLWCNARTSASGYYKK 

T OF<2TnOrxFVVT)TPPTfTPTTrT TV/TV"K"K'T T 
XjOFoIII^IJxI V I Uirr 1 Ol^xxJLLflYl I XSJNJ-i X 


SEQ ID 
NO:329 


15_12A8 


MIEVKPINAEDTYEIRHRxLRPNQPLEACKYETDLLGGTFH 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 

REQKAGSTLHIHAEAIJLRKKGADLLWCNARTSASGYYKK 

X-iLjr' OXiV^CJXl V I XJlr a X VJX^lTJJLrlVl X XSJtVX-r X 


SEQID 
NO:330 


15_12D6 


MIEVKPINAEDTYEIRHRILPJPNQPLEACMYETDLLRGAFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 

xri r:pcT5nrTPVVTYIT>PVfTPPrTT VTVRTKT T 
l\JLA TX^O HA^Il Tr. V X X I I V 1 XxXXjlVX x XX IX 1 i X 


SEQID 
NO:331 


15_12D8 


MIEVKPINAEDTYEIRIiRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK
vj npsFnriifVYn tppvoppttt mvtctct t 

' a rr.^r.wt Try V I X-*XT Jr V VJJrXi 1 1 «IVX X XX IN 1 < X 


SEQID 
NO:332 


15_12D9 


MIEVKPINAEDTYEIRHPJDLRPNQPLEACKYETDLLRGTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELIJRKKGADLLWCNARTS ASGYYK 
kt np^pnopwnTPPVfrPPnT a/tvtcki t 

IX 1 A Tr.^ rA Tr, V X UJJTJT V VJX fill .1 V 1 X IN IX 1 ^X 


SEQID 
NO:333 


15_3F10 


MIEVKPINAEDTYEIPvHRILRPNQPLEACKYETDLLRGAFH 

LGGYYRGKLISIVSFHQAEHPELEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 

X-»vJJr , olil^vJ-C« V I U X x x /^VJx XxXL>lVX X X JVL» X 


SEQ ID
NO:334


15 

X^J «/VJ X X 


MTRVKPTN ARPTYPXRHR TT RPNOPT JR ACKYRTDT J GOTFH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK
KLGFSEQGEVYDIPPVGPHILMYKKLT


SEQID 
NO:335 


15_4F11 


MlEVKTINAEDTYKERHRnJRPNQPI^ACMYF^IXGGT^ 
LGGYYRGKLVSIASFNQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYK 
KLGFSEQGEVYDIPPTGPHILMYKKLT


SEQ ID
NO:336


15_4H3


MIEVKPINAEDTYEIRHRIIJRPNQPI^ACKYETDLLGGTFH
LGGYYRGKLVSIASFHQAEHPEDEGQKQYQLRGMATLEE 
YPOEQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
ttt GF^FnGFVYDIPPTGPHTLMYKKLT 

IVLjVJI uJUV,/VJJ-/ V X X^" JUL X X VJA J. 1 1 1 /1YX JL IVXVIjI 


SEQID 
NO:337 


15_6D3 


MIEVKPINAEDTYEIRHEIJ1JIPNQPIEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPFJ^GQKQYQLJIGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDIPPTGPHILMYKKLT


SEQID 
NO:338 


15_6G11 


MIEVKPINAEDTYEIRHPaLJRPNQPIEACKYFn'DIJ^GAm 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK
TCf GFSFOGKVYTTIPPVGPHTLMYKKLT 

XVXyVJJ. OXjV^VJ XV V X X/XL X V VJX X 1 1 1 , i 1 YX X XVXVX-4 X 


SEQID 
NO:339 


15_9F6 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 

LGGYYRGKJUSIASFHQAEHPELEGQKQYQLRGMATLEEY 

REQKAGSTLIRHAEELLRRKGADLLWCNARTSASGYYKK

T nF^FnnFVYTJTPPVGPTTTT MYKKLT 
ijurocv^uc v x XJix x v urn 1 1 «ivx x > x 


SEQID 
NO:340 


15F5 


mmvkpmaedtyeipjipalrpnqpleackyetdllggtfh 
lggyyrgkijsiasfhkaehselegeeqyqlrgmatlegy 
reqkagstliryaeeixrkxgadllwcnartsvsgyykk 

T OF^FnOFVYTjiPPIGPHTT MYKKLT 

XvVJXOXwV^VJJJ) V X X-/XX X XVJX X XXLilVX X JVLVJU X 


SEQID 
NO:341 


16A1 


IvlIEVKPINAEDTYElRHR^ 

LGGYYQGKLISIASFHKAEHSGIJEGEEQYQLRGMATl^ 
REQKAGSTLIRHAEE1JLRKKGADLLWCN ARTS VS G YYEK 

X^VJX^Ox^V^VJXJ/ V X J-/XX X XVJX XXXJL*iVX 1 IN. XX 1 j X 


SEQID 
NO:342 


16H3 


IVllDVKPINAEDTY^ 

LGGYYQGIO.ISIASmQAEHSEI^GQKQYQLRGMATLEGY 

REQKAGSTL1RHAEEIJURXKGADLLW 

T OFWOOFVYTj TPPTOPITTT MYK1CLT 

X-AJJT OX-A^VX-D V X XJ XX X XVJX X XXXjlLVX X XVX\ 1 J X 


SEQID 
NO:343 


17C12 


Mffi VKPIS AEDT YEIRHRILRPNQPLE ACMYETDIXGG AM 

LGGYYQGKI^ISIASFHQAEHSEI^GQKQYQLRGMATIJBGY 

REQKAGSTURHAEEIXRK^ 

T OT7C!T7nrTR WD TPPT(TP"RTT VJTYTCK1 T 


SEQID 
NO:344 


18D6 


NlIEVKTINAEDTYElRm 

LGGYYRGxsXISIASFHKAEffi 

REQKAGSTLIRHAEEIiRKX 

T OP^PnrrFVVTjTPPTrTPHTT IVTYKK'T A 

XjVJTJT O JG»V^ \J±~i V X LJXXT X XvJX XXXJL A YX X XN IV 1 /xx- 


SEQID 
NO:345 


19C6 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLICIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLlRJEIAEELX,RKKG ADLLWCNARTS VRG YYEK 
T GFSFnGGVVnTPPTGPHTLMYKKLA 

X^VJX^OXjV^vJ vJ V X XV XX X XVJX XIII a VX X 1V1VUO. 


SEQID 
NO:346 


19D5 


MIEVKPINAEDTYEIRHCILRPNQPLEACMYETDLLGGTFH 
LGGYYQGKLISJASFHKAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTL1RHAEEIXRKKGADLLWCNARTSVSGYYKK 
LGFSEQGEVYDIPPIGPHIIJylYKKLT 


SEQID 
NO:347 


20A12 


MJEVKPEvTAEDTYEnmRILRPNQPLEACIVrYETTJLLGGTFH 
LGGYYQGIO.ISIASFHNAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLJRHAEELLRKKGVDLLWCNARTS VSGY YKK 
LGFSEQGGIYDIPPIGPHIIJVxYKKLA 


SEQID 
NO:348 


20F2 


I^VKPINAEDTYElRHRILRPNQPIJEACMYETDIXGGTm 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYEK 
T rtPQT5r>fTT7VYT>TPPTOP'HTr MY"K"KT T


SEQID 
NO:349


21E11


MffiVKPINAEDTYEIRHRILRPNQPLEACKYETDIXGGAFH 
LGGYYQGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHA EFT I RKKGADLLWCNARTSVSGYYKK 
T CrF^FOOF WnTPPTnPTTTT MYTCKT .T 


SEQID 
NO:350 


23H11 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYQGKLISIASFEKAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYEK 
T OFSFOGFVYD1PPTGPHIEMYKKLA 


SEQID 
NO:351 


24C1 


MffiVlCPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 

LGGYYRDRLISIASFHQAEHSELEGQKQYQLRGMATLEGY 

REQKAGSTIJRHAEELLRKKGADIXWCNARTSVSGYYKK 

T nRQFOrjFVVT^TPPTrjPHTT MYK'K'T T 
L\jrjE\^\jjjd v x x^urirxvjir Jn I ■ A\± x rv r\ i <x 


SEQID 
NO:352 


24C6 


Mffi\^INAEDTYEIPaiRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
I^QKAGSTLIRHAEEIXRKKGADLLWCNAWSVSGYYKXL 

r»FQFri<TTlV\ r nTPPTr : rP'HlT X/TV"K"K'T a 


SEQID 
NO:353 


24E7


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHNAEHSEIJEGQKQYQLRGMA'rLEGY 
REQKAGSTLIRHA EF.T ,T RKKGADLLWCNARTSASGYYEK 

T riT7QT30^T7\7A r riTPPTOP'HTT AAVKHK'T A 
J^OJroxi^Lixi V x JJiJrJrlvj-r Jn 1 1 Avi I iSJSJLi/v 


SEQID 
NO:354 


2_8C3 


MIEVl<PlNAFJ3TYElRHRlLRPNQPl^ACMYF/rDLLGGTFH 
LGGYYRDRLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKA.GSTLIRHAEELLRKKGADLLWCNARTSASGYYEK 
t nT7<:FonF wnTPPTrspwTT vmrifT T 


SEQID 
NO:355 


2H3 


I^VKPINAEDTYEIRHM 

LGGYYQGICLISTASmQAGHSEIJEGQKQYQIJRGMATLEG 
Y1^RKA.GSTL]RHAEELLRKXGADLLWCNARISAS 

X^VJjrOXwV^ vJ vj V X X^rXX X JLvJX X 1.1 I-jXVX X XVXV.I .j X 


SEQID 
NO:356 


30G8 


M1EVXPINAEDTYEIRHRI1JRPNQP1JEACMFETDLLGGAFH 

LGGYx'QGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYKK 

T fT-FCFriOF VVnTPPTOPTTTT MY~K"KT T 
XwVjriroJ_*>s^vJXi v x jl^ixx xvjx xi 1 1 <ivx x x\ r\ i »x 


SEQID 
NO:357 


3B_10C4 


MIEVRPINAEDTYE^ 

LGGYYRGl^ISIASraQAEHSELEGQKQYQLRGlVlA 
REQKAGSTLllxHAE 

l^vjrx^oxjV^VXCf/\ x xJxx x WJx xxxx-fivx x x\ rv i *x \ 


SEQID 
NO:358 


3B_10G7 


lvnEvKPINAEDTYExRHRILlU>NQPLEACMYETDIXGGTra 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLlPxHAEElXRKKGADIXWCNARTSASGYYKK 

l^\jrOMJ*\J\J\J V I xJxxxxKJx XJ.1-I->1VX x XV IX 1 * X 


SEQ ID
NO:359


3B_12B1


MTRVKPTN AEDTYEIRHRrLRPNQPLEAC MYETDLLGGTFH 
LGGYYRGKLISxASFHQAEH^ 
REQKAGSTLIRHAEEIJ^RKX 
LGFSEQGEVYDlPPIGPHILlvmOG^ 


SEQID 
NO:360 


3B_12D10 


MIEVKPmAEDTYEIRHRII^ 
LGGYYRGl<XISxASFOTAEHSEIJEGQ 
REQKAGSTLIRHAEEIXRl^ 
GFSEOGEVYDIPPIGPiraJVlTKxvX,^ 


SEQ ID
NO:361


3B_2E5


MIEVlsTINAEDT^
IXjGYYRGKUSIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHA EFT I RKKGADLLWCNARTSASGYYEK 

Lur o xv^Lxc, V I JLJJJrr^lvjrJrJn 1 1 #IVL I JSJSJL,! 


SEQID 
NO:362 


3C_10H3 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHA EF,T ,T ■RKKGADLLWCNARISASGYYKKL 
vjJroiiV^dLj V x UJJrJr V Kjrrtl II ,jyi x JvJv-L/ 1 


SEQID 
NO:363 


3C_12H10 


MffiVKPINAEDTYEIRHRILRPNQPIJEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
RGQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYEK 


SEQID 
NO:364 


3C_9H8 


MEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYQDRLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIR Y AEELLRKKGADLLWCN ARIS AS G Y YEKL 
Or'oritJCjrJiV Y JUJLrJrKjirrl 1 1 ,IVL i JsJsJL. 1 


SEQID 
NO:365 


4A_1B11 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASmQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYEK 
L\jri>E\l LjCi V Y I jikkktKHII ,m y JvivL 1 


SEQID 
NO:366 


4A_1C2 


MIEVKPESfAEDTYEIRHRILRPNQPIJEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEEY
REQKAGSTLIRHAEELIIiKKGADLLWCNARTSASGYYKK 
LGFSEQGE V YDIrPlCjFJM 1 1 .M Y KKL 1 


SEQID 
NO:367 


4B_13E1 


MffiVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY
PJBQKAGSTLIRHAEELLRKKGADLLWCNARISASGYYEKL 
Gr SEQGEV YDlrPlQFJlJLLM Y KKL, 1 


SEQID 
NO:368 


4B_13G10 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYET^ 
LGGYYRGKLISlASmQAEHSEIJEGQKQYQLRGMATLEGY 
MQKAGSTIJRHAEEIXRKKGAD 
LGFSEQGG V YDIPPIGF YILM Y KJsJL I 


SEQID 
NO:369 


4B_16E1 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGroEQGG V YDiFJrlCjrriJI Y JsJvi^ 1 


SEQID 
NO:370 


4B_17A1 


IVQEVI^INAEDTYEIRHRIIJRPNQ 
LGGYYRGKLISIASI^QAEHPEI^ 

REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYEK

T OT7CT3/^r^T3\rVT|^TTDTDT^OTTTT A/TVVVT T 

lXrrMiyLxCrV I IJiJr r^lLjJr Jn J J /iVl I JsJSJ^l 


SEQID 
NO:371 


4B_18F11 


MIEVNPINAEDTYEIRHR^ 
LGGYYRG10JSlASFx^AEHSELX>GQ 

LGFSEQGEVYDIPPIGPHISMYKKLT 


SEQID 
NO:372 


4B_19C8 


MIEVKPn^AEDTYEIRHRILRPNQPIJEACKYETDLLGGTFH 
LGGYYRGKOSIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPIGPHILMYKKLA 


SEQID 
NO:373 


4B_1G4 


MEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGAFH 
LGGYYRGKLISIASFHQSEHPELEGQKQYQLRGMATLEGY 
RELKAGSTLIRHAEELLJIKKGADIXWCNARISASGYYKKL 
GFSEQGEVYDIPPIGPHILMYKKLT


SEQID 
NO:374 


4B_21C6 


IVnEVKPmAEDTYEnonOLRPN^ 

LGGYYRGKLISIASFHQAEHSEl^GQKQYQLRGMATLEEY 
REQKAGSTLIRHAEEIJJlKKGADLLWCNAiaSASGYYK^ 
GFSEQGGVYDIPPIGPHILMYKBCLT 


SEQID 

NO:375 


4B_2H7 


MIEVKPINAEDTYEIRHRILRPNQPIJEACMYETO 
LGGYYRGKLISIASFHQAEHSEIJEGQKQYQLJRGMATLEGY 
REQKAGSTLIRHAEELIJIKKGADLLWCNARTSASGYY^ 
LGFSEOGGVYGIPPIGPHIIJvIYKKLT 


SEQID 
NO:376 


4B_2H8 


MffiAKPINAFJDTYEmHRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEOGEVYDH'PIGPHEJyiYKKLT 


SEQID 
NO:377 


4B_6D8 


MffiVKPINAEDTYEIRHIUIJIPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
T OFSPHGFVYT3IPPIGPHILMYKKLT 

1AJI OJJiilXJij V JL XxXX X JLV_J JL X LLL^ITX X 


SEQID 
NO:378 


4B_7E8 


MIEVKP^^^AEDTYE^lHRjXRPNQPLEAC^^^TDLLGGTFH 
LGGYYRGKILJSIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYKK
t riFSFnnFVYnTPPTGPHlLMYIGKLT 

LiVJ r O IJiV^ VJ Xi V X X-'XX X XVJX IXIJUIVI X 1V1VXJ X 


SEQID 
NO:379 


4C_8C9 


MffiVJKPINAEDTYEIRHRIjLRPNQPLEACMYF/rDIJLRGAFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
P^QKAGSTLHIHAEELLRKKGADLLWCNARTSASGYYEK 
T GFSEOGEVYDIPPIGPHILMYKKLT 


SEQID 
NO:380 


4H1 


MffiVKPINAEDTYEnmRILRPNQPEEACMYETDLLGGAFH 
LGGYYQGKLISIASFHQAVHSELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYK
KJ GFSFOGGVYDlPPIGPHnJvrYKKLT 


SEQID 
NO:381


6_14D10 


MffiVKPmAEDTYEHmRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKXISIASFHQAEHSELEGHKQYQLRGMATLEEY 
REQKAGSTLElHAEEIiRKXGADIXWCNARTSASGYYKK 
T GFSEOGGVYDjTPVGPHttMYKKLT 


SEQID 
NO:382 


6_15G7 


]VLIEVKTINAEDTYEJT<HRILRPNQPLEACKYETDLLGGTFH 
LGGYYRGKIJSIASmQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEEIX^KXGADIXWCNARTSASGYYKK 
LGFSEOGEVYDIPPVGPHnJVIYKKLT 

1 -* V,.. J JL V r J f'^L 1 - r . J ▼ JL JL^ JLJL JL T JL .1 1 > rfi tj. ■ ^ * » ■ * * 


SEQID 
NO:383 


6_16A5 


MIEVKPINAEDTYEIPTIRn^RPNQPLEACKYETDIXGGlEll 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTUKHAEE1T.RKKGADLLWCNARTSASGYYKK 
T OF<?FOGGVYniPPVGPHILMYKKLT 

1 A. T 1 wJ XJ/ VJ v_J V X L/JLTl V NJ1 XIII /XV X x iixvxwx 


SEQID 
NO:384 


6_16F5 


MffiVKPmAEDTYEmHRJXRPNQPl^ACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAVHSELEGQKQYQLRGMATLEGY 
REQKAGSTLJRHAEELJJIKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKKLT


SEQID 
NO:385 


6_17C5 


MffiVKPINAEDTYElRHRILRPNQPI^ACKYEADLLGGlJrri 
LGGYYRGBCLIS1ASFHQAEHPELEGQKQYQLRGMATLEGN 
REQKAGSTL1RIIAEELLRKXGADIXWCNARTSASGYYKK 
LGFSEOGEVYDWPIGPHILMYKKLT 


SEQ ID
NO:386


6_18C7


MEVKPINAEDTYEIRHRJOJ^NQPLEAC^
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEEIJ^RKKGADIXWCNARISASGYYKKL 
GFSEQGEVYDffPVGPHIIMYKKLT 


SEQID 
NO:387 


6_18D7


MmVKPmAEDTYEIRXRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
PJEQKAGSTLIRHAEFJJJRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKKLT 


SEQID 
NO:388 


6_19A10 


MIEAKPINAEDTYEmHPvJIJIPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATIEGY 
REQKAGSTLIRHAEEIJURKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDIPPTGPHILMYKKLT 


SEQID 
NO:389 


6_19B6 


MIEVKPINAEDTYEIRHRILRPNQPIJEACMYETDIJL.RGAFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLniHATiELLRKKGADIiWCNARTSASGYYKK 
LGFSEQGEVYDIPPVGPHIIMYKKLT 


SEQID 
NO:390 


6_19C3 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTFH
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLOIHAEELLRKKG ADLLWCN ARTS AS GY YKK 
LGFSEQGEVYDIPPIGPHILMYKKLT 


SEQID 
NO:391 


6_19C8 


MffiVKPmAEDTYEIRHRILRPNQPLEACKYETDLLGGTLH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRQAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKELT


SEQID 
NO:392 


6_20A7 


MIEVKPIN AFX)T YEIPJ3RILRPNQPLEACMYETD If A 
LCKjYYRGKLISIASFHQAEHSDLEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDIPPVGPHIIMYKKLT 


SEQID 
NO:393 


6_20A9 


MffiVKPINAGDTYEIRHEaLRPNQPLEACKYETDLLGG 1 ±^ri 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDlPPVGPHnJMYKKLT 


SEQID 
NO:394 


6_20H5 


MJEVKPIN AEDTYEIRHRILRPNQPLEACKYETDLLGG 1 EEL 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
PJEQKAGSTL1RHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDIPPIGPHILMYKKLT 


SEQID 
NO:395 


6_21F4 


MIEVKPlNAEDTYEniHRVIJRPNQPLEACMYETDLLGGAF 
HLGGYYRGKOSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTURHAEEIXRKKGADLLWCNARTSASGYYK 
KLGFSEOGEVYDWPVGP1TILMYKKLT 


SEQID 
NO:396 


6_22C9 


MIEVKPINAEDTYEIRHRILRPNRPIEACMYETDILGGTFH 
LGGYYRGKLISIASFHQAEHPGLEGKKQYQLRGMATLEEY 
PvEQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKKLT 


SEQID 
NO:397 


6_22D9 


NHEVKTINAEDTYEIRHRILRPNQPLEACMYETDLLEGlbtl 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDIPPVGPHILMYKKLT 


SEQID 
NO:398 


6_22H9 


MIEVKTrN AEDT YEIRHRDJRPNQPLEACMYETDLLG GTFH 
LGGYYRGKLISIASmQAEHSELEGQKQYQLRGMATLDEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
SEQ ID
NO:399 


6_23H3 


MffiVKPINAEDTYEIPJnULRPNQPLEACMYGTDLLGGTFH 
LGGYYRGKUSIASFHQAEQPELEGQKQYQERGMATLEGY 
PJEQKAGSTLIRHAEEIXRKKGADLLWCNARTSASGYYKK 
T OFSIFOOGVYnTPPVfrPHTT MYTTKT T 

XjVJX OUV^VJVJ V JL JL/irr V V.J JL X i 1 1 <iVX X XVXVX-sX 


SEQID 
NO:400 


6_23H7 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY
REQKAGSTLIRHAEEJl.RKKGADIi,WCNARTSASGYYKKL 
OFSFOOOVYDTPPVOPHTT MYTCTCT T 


SEQID 
NO:401


6_2H1 


MIEVia ) mAEDTYEniHRVLRPNQPLEACMYETDLLGGTF 
HLGGYYRGKLISIASFHQAEHPELEGQKPYQLRGMATLEG 
YREQKAGSTLIRHAEET I ,RKKGADLLWCNARTS ASGYYK 
KT OFSFOOFIYDTPPTGPHILMYKTCT T 

JVJLAJX, 01jV^\JJ_/JL X XV XX X XVJX XXJLL/1VX X XV1\ 1 j X 


SEQID 
NO:402 


6_3D6 


NxffiKPINAEDTYEntHPJn-^ 

GGYYRGKLISIASFHQAEHPFJJEGQKQYQLRGMATLEGYR 
EQKAGSTLIRHAEELXRKKGADLLWCNARTS AS GYYKKL 

rTR^FrjriFVVnTPPVOP'HTT MVTfKT T 


SEQID 
NO:403 


6_3G3 


MffiVKPINAEDTYEIimRILPJPNQPLEACMYETDLLGGTra 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLJPvHAEELJ^KKGADIJl-WCNARTSASGYYKK 

J_AJX*0 JJiV^vJXj V 1 J-/lx XT V VJJT Xl 1 1 »IVX X J\ IS 1 . X 


SEQID 
NO:404 


6_3H2 


MIEVKPINAEDTYEIRIIRII^^ 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 

REQKAGSTLIRIIAEELLRKKGADLLWCNARTSASGYYKK 

T OFSFOflF WDTPPVOPTITT MY"K"RT T 
x-*Vjx^o1-#v^vjx^ v x x-'xir x v vJJrxi 1 1 «ivx x jmm < x 


SEQID 
NO:405 


6_4A10 


MffiVKPINAEDTYE^^ 
LGGYYRGIOJSIASffiQAEHP^^ 
l^QKAGSTLIRHAEELLRKKGADx^ 
xwVjrojCfV^vJJ^ v x Jjirr v vjrn 1 1 <ivx a xvxsjo x 


SEQID 
NO:406 


6_4B1 


MffiVKPINAFJDTYEIPJroVLRPNQPLEACMYETDLLGGTF 
HLGGYYRGKLIGIASFHQAEIiPELEGQKQYQLRGMATxJE 
G YREQKAGSTLIRHAEELLRKKGADLLWCNARTS AS GYY 
FKT OFSnOOFVYnTPPTfrPHTT MYTCTCT.T 

r*. IX 1 A TT.^k ^ * X UljLX XVJX XI 1 1 »1VX liMM 'X 


SEQID 
NO:407 


6_5D11 


MJEVKPINAEDTYEJRHRILRPNQPLEACMYETDIXGGTFH 

LGGYYRGKIJSIASmQAEHPELEGQKQYQLRGMATLEEY 

REQKAGSTURHAEEIJJIKKGADIXWCNARTSASGYYKK 

T OFSFOnEVYDTPPTOPTTn MYTCTfT T 
i_*vji oXjV^vjXj v x x-^xjrjrxvJx xi 1 1 tivx x xvxvx-* x 


SEQID 
NO:408 


6_5F11 


MffiVKTINAFX)TYE]RHPJIJlPNQPLEACMYETDLLGGTFH 
LGGYYRGKLISL\SFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLDIHAFELLRKKGADLLWCNARTSASGYYKK 
T OFN5Fnr»FVWnTPPVOP'HTT MYKKI T 


SEQ ID 
NO:409 


6_5G9 


MEEVKPIN AI^T YE1RHRILRPNQPLE ACMYETDIXGGTra 
LGGYYRGKLKIASraQAFJISia^GQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADIXWCNARISASGYYKKL 
GFSEQGGVYDIPPVGPHxLMYKKLT 


SEQID 
NO:410 


6_6D5 


MJEVKPMAEDAYEIRHRJXRPNQPLEACKYETDIXGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
IXjFSEQGGVYDjTPVGPHIXMYKKLT 


SEQ ID
NO:411


6_7D1


MJEVKPINAEDTYEIRlxRJXRPNQPI^ACMYETDIXRGAFH
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKKLT


SEQID 
NO:412 


6_8H3 


MIEVKPMAEDTYEIRHEULRPNQPLEACMYETDLLGGTFH 
LGGYYRGKXISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEEIXRKKGADIJ^WCNARTSASGYYKK 
LGFSEQGGVYDIPPVGPHILMYKKLT


SEQID 
NO:413 


6_9G11 


MmVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTLH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
PJEQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDIPPVGPHILMYKKLT


SEQID 
NO:414 


6F1 


MEVKPINAEDTYEKHRILRPNQPLEACMYE1DLLGGTFH 
LGGYYRGKLVCIASFHKAEHSELEGQKQYQLRGMATLDG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYE 
KT GFSEOGEVYDjTPVGPHELMYKKLT 

, \jLiVJ-l LJj_«V^VJjL-< V X JU/XX X V VJJ. X I I.I <1TX X AVAVJ ^ 


SEQID 
NO:415 


7_1C4 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDlXGG'ljr'H 

LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK

T GE55EOGGVYDIPPIGPHILMYKKLT 


SEQID 
NO:416 


7_2A10 


MmVKPE^AEDTYEIRHRILRPNQPLEACKYETDLLGGibri 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
T OPSFflOfrVYDTPPIGPHILMYKKLT 

1, jVjr^t^ *-^Vj' vj VJ V i lyJUT x xvil xxxx-ulvx x x>.j.^ x 


SEQID 
NO:417 


7_2A11 


MIEVKPIN AEDTYEIRHRILRPNQPI£ACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
T f^FSFOOOVYDTPPVGPHILMYKKLT 

1 X II lit A/VJVl V XiyJJ. JL T VJ1 XIII J* ■ ■ ^ ■ < J. 


SEQID 
NO:418 


7_2D7 


MffiVKPIN AEDT YEIRHRDLRPNQPLEACKYETDLLGG tt*H 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
T OFSFOfiFVYDTPPVGPHILMYKKLT 


SEQID 
NO:419 


7_5C7 


MIEVKPIN AEDTYEIRJHRILRPNQPIJ3ACMYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKVGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
T rJFSFnoOVYTjlRPVGPHIIJvrYKKLT 

I A ' V X XV XX X V VJ X XXXXjXtA x X^ ■ ~v « j jl 


SEQID 
NO:420


7_9C9 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGlJbH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDIPPIGPHILMYKKLT


SEQID 
NO:421 


9_13F10 


MffiVKTlNAEDTYEniHRILRPNQPLEACKYETDLXRGAFH 
LGGYYRGKLVSIASFHQAEHSELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
K1X3FSEQGEVYDIPPTGPHI1MYKKLT 


SEQID 
NO:422 


9_13F1 


MffiAKPLNAEDTYEIRjmLIJlPNQPLEACMYETDLLGGlbH 
LGGYYRGKLVSIASFHQAEHTELEGQKQYQLRGMATLEE 
YREQKAGSTLIRHAEELiRKKGADLLWCNARTSASGYYK 
jKLGFSEQGEVYDjTPVGPHLIMYKKLT 


SEQID 
NO:423 


9_15D5 


MIEVKPIN AEDTYEniHRILRPNQPLDACKYETDLLGGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDIPPVGPHILMYKKLT


SEQ ID
NO:424


9_15D8


MEVKPFN AEDT YEIRHRILRPNQPLEACMYETDLLGG 1 EH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEAIXRKKGADIXWCNARTSASGYYK 
KLGFSEQGEVYDTPPVGPHILMYKKLT 


SEQID 
NO:425 


9_15H3 


MHTVKPINAEDTYEIRHRIIJRPNQPLEACMYETO 
LGGYYRGKUSIASFHQAEHPELEGQKQYQLRGMATIEEY 
HEQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
T GPSEOGEVYNTPPVGPHII .MYKKLT 


SEQID 
NO:426 


9_18H2 


MffiVKPINAEDTYEIRHRILRPNQP]^ACMYETDLLGGlt<tl 
LGGYYRGKLISIASFHQAEHPELVGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK 
LGFSEQGEVYDIPPVGPHILMYKKLT 


SEQID 
NO:427 


9_20F12 


MIEVKPINAEDTYEIRHRVLRPNQPIXACMYETDLLGGTF 
HLGGYYRGELVSIASFHQAEHPELEGQKQYQLRGMATLE 
GYREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYY 
KKLGFSEOGGVYDIPPVGPHIIJ^IYKKLT 


SEQID 
NO:428 


9_21C8 


MIEVKPINAEDTYEIRHRILRPNQPIEACMYETDLLGGTFH
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAFELLJIKKGADLLWCNARTSASGYYKK 
t riPsnnGEVYnTPPVGPHELMYKKLT 


SEQID 
NO:429 


9_22B1 


MIEVKPmAEDTYEIRHRILRPNQPLEACKYETDLLGGlJeJtl 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YREQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
TCT GFSFOGEVYDLPPTGPinLMYKKLT 


SEQID 
NO:430 


9_23A10 


MEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLGGTLH 
LGGYYRGKLVSIASFHQAEHPELEGQKQYQLRGMATLEG 
YRGQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYK 
KT GFSEOGGVYDH'PVGPHIUtfYKKLT 


SEQID 
NO:431 


9_24F6 


MIEVKPINAEDTYEIRHRILRPNQPLEACKYETDLLRGAFH
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEALLRKKGADLLWCNARTSASGYYKK 
T GFSEOGEVYDIPPTGPHIIMYKKLT 


SEQID 
NO:432 


9_4H10 


MIEVKPIN AEDTYEIRHRILRPNQPLEACKYETDIXGGTLH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLIWCNARTSASGYYKKL 
GFSFOGEVYDIPPVGPHILMYKKLT 


SEQID 
NO:433 


9_4H8 


MIEVKPINAEDTYEIRHRILRPNQPLEACMYETDLLGGTFH
LGGYYRGKLISIASFNQAEHPELEGQKQYQLRGMATLEGY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
T rn^FnOFVYDTPPVGPHILMYlQKLT 


SEQID 
NO:434 


9_8H1 


MffiVKPITAEDTYEniHRILRPNQPLEACKYETDLLGGlJ-lli. 
GGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGYR 
EQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKKL 
GFSEQGEVYDlPPTGPffllJvlYKKLT 


SEQ ID
NO:435 


9_9H7 


MffiVKPINAEDAYFJRHRILRPNQPI£ACKYETDLLGSlJr±i 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEEY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSASGYYKK
LGFSEQGEVYDIPPVGPHILMYKKLT 


SEQ ID
NO:436


9C6


MTF. WPTN A F.DTYF.TR HRILRPNOPLE ACMYETDLLGGlfil
LGGYYQGKLISIASFHNAEHSELEGQKQYQLRGMATLEGY
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYEK
LGFSEQGEVYDIPPVGPHELMYKKLA


SEQ ID
NO:437 


9H11


VEEVKPINAEDT YEIRHRILRPNQPLEACKYETDLLGG 1 Jbli 
LGGYYRGKLISIASFHKAEHSELEGEEQYQLRGMATLEGY
REQKAGSTLRHAEELLRKKGADLLWCNARTSVSGYYKK 
LGFSEQGEVYDIPPIGPHILMYKKLT 


SEQ ID
NO:438 


0_4B10


VOEVKPINAEDTYELRHKILRPNQPIEACMYESDLLRGAFH 
LGGFYRGKLISIASFHQAEHSDLEGQKQYQLRGMATLEGY 
EU3QKAGSTLKHAEEIUIKRGADMLWCNARTTASGYYKK 
LGFSEQGEIFDTPPVGPHILMYKRLT 


SEQID 
NO:439 


0_5B11 


MIEYKPINAEDTYELRHKILRPNQPIEACMYESDLLRGAFH 
LGGFYGGKLISIASFHQAEHSDLEGQKQYQLRGMATLEGY 
ElDQKAGSTLIKHAEQLLRKRGADIvILWCNARTSASGYYK 
KIXJFSEQGEVFETPPVGPHILMYKKrr 


SEQID 
NO:440 


0_5B3 


MLEVKPINAEDTYELRHRILRPNQPIEACMYETDLLRGAFH 
LGGFYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
RDQKAGSSLIKHAEQLLRKRGADLLWCNARTSASGYYKK 
LGFSEQGEVFDTPPVGPHILMYKRrr 


SEQID 
NO:441


0_5B4 


MIJBVKLINAEDTYELRHRILPvPNQPLEACMYETDLLRGAF 
HLGGFYRGKLISIASFHQAEHSDLEGQKQYQLRGMATLEG 
FRDQKAGSSLnCHAEEILRKRGANLLWCNARTSASGYYKK 
LGFSEQGEVFDTPPVGPHILMYKRrr 


SEQID 
NO:442


0_5B8 


MffiVKPINAEDTYELRHKILRPNQPffiACMYESDLLRGAFH 
LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
RDQKAGSSLIRHAEQILRKRGADLLWCNARTSASGYYKK 
LGFSEQGEIFDTPPVGPHILMYKRLT 


SEQID 

NO:443


0_5C4 


MIEVKPINAEDTYELRHBCni.RPNQPLEACMYETDLLRGAF 

HLGGFYRGKLISIASFHQAEHSGLQGQKQYQLRGMATLEG 

YREQKAGSSHKHAEEILRKKGADLLWCNARTSASGYYKK 

LGFSEQGEIFDTPPVGPHILMYKRIT 


SEQ ID
NO:444


0_5D11 


MIEVKPINAEDTYELRHRILRPNQPIEACMYESDLLRGAFH 
LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEQLLRKRGADLLWCNARTSASGYYKR
LGFSEQGEVFDTPPVGPHILMYKRLT 


SEQ ID
NO:445


0_5D3 


MLEVKPINAEDTYELRHRILRPNQPIEACMYESDLLRGAFH 
LGGYYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
REQKAGSSIJKHAEEILRKRGADLLWCNARTSASGYYKKL 

GFSEQGEIFETPPVGPHILMYKRIT 


SEQID 
NO:446


0_5D7 


MffiVKPINAEETYEIJElHRILRPNQPffiACMYETDLLRGAFH 
LGGFYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
RDQKAGSSLIRHAEQLLRKKGANMLWCNARTTASGYYK 
KLGFSEQGEIFDTPPVGPHILMYKRrr 


SEQID 
NO:447 


0_6B4 


MLEVKPESTAEDTYELRHRILRPNQPIEACMYESDLLRGALH 
LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGF 
RDQKAGSSLIRHAEQILRKRGADLLWCNARTSASGYYKK
LGFSEQGKVFDTPPVGPinLMYKRIT 


SEQID 
NO:448 


0_6D10 


MIJEVKPrNAEDTYELRHKlLRPNQPIEVCMYETDLLRGAF 

HLGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEG 

YRDQKAGSSLIRHAEQILRKRGADMLWCNARTSASGYYK 
KT nRSPOfrFVPRTPPVGPHELMYKRLT


SEQID 
NO:449 


0_6D11 


MIEVKPINAEDTYELRHRILRPNQPIEACMYESDLLRGAFH 
LGGYYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGF 
RDQKAGS SL1EHAEQILRKRGADLLWCN ARTS ASG YYKK 
GFSEOGEVFETPPVGPHILMYKPJT 

jVJl F -'V^ " * ' V X X-i XXX T VJX X XXI JiTl X I>1 ^J. -■- 


SEQID 
NO:450


0_6F2 


MIEVKPINAEDTYELJRHRIIJRPNQPIEACMYESDLLRGAFH 
LGGYYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGF 
REQKAGSTLJPJEIAEQILRKRGADMLWCNARTSASGYYKK 
T GKSFOGFTFDTPPVGPHlLMYKRrr 

_y\ PI t 'I A^/V 1 | j 1 1 t ■/ XXX V V_JX X 1,1.1 <l VX X XVXVX X 


SEQID 
NO:451 


0_6H9 


MffiVKPFNAFJDTYELRHKILRPNQPffiACMYETDLLRGAFH 
LGGFYGGKLISIASFHQAEHSDLEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEEDJEiKKGAJSrLLWCNARTSASGYYKKL 
GFSFOGFVFDTPPVGPHIUtfYIGRLT 

VJI ijJjV^/vJJL/ V X X-/ XXX V VJX X 1 1 1 4i tX X ivivLj x 


SEQID 
NO:452 


10_4C10 


1V0EVKPFNAEDTYEUIHKILRPNQPLEVCMYETDLLRGAF 
HLGGXYRGKLISL\SEHQAEHSELQGQKQYQLRGMATLEG 
YRDQK AGS S LIKHAEQILRKRG ADXLWCN ARTS AS G YYK 

XNJjVJJTO HjV^/ *J XltXX XJ X XT X V VJJT X I 1 1 ilVX X — * X 


SEQID 
NO:453 


10_4D5 


MIEVKTINAEDTYELR^ 

LGGFYRGIOLISIASraQAEHSDLQ^ 

I^QKAGSTIJKJxAEQIUIK^ 

rvp ct7 r\ rip? VTTTiTPP VfrPTTTT "MT VKT^TT 
vjrro Jjiv^vJTC* v x\l* a it it v vjjt xtxl*iyx x a 


SEQID 
NO:454 


10_4F2 


MIJiVKPINAEDTYEI^^ 

LGGx^RGKLISIASmQAEHSELQGQKQYQLRGI^Tl^EGY 
REQKAGSSLIRHAEEIlJll^ 

i a Tr^rA^ thi rr. 1 r i v vjitxi 1 1 ^ivx x isjvlji 


SEQ ID NO:455


10_4F9 


NDEVKPINAEDTYEIJRHRILRPNQPffiVCMYETDIXRG 
LGGFx^GKEISIASFHQAEHSEIXJGQKQYQLRGMATLEGF 
REQKAGSSEH1HAEQILRKRGADLLWCNARTSASGYYKKL 
OP5Fnr}FrFr>TPPVGPTTTT MYTCRLT 

v t ii 1 1 a jv.ji jii i_j ill v vjr xujulvi i x\x\x^i 


SEQ ID NO:456


10_4G5 


MIEVKPINAEIJTYFXIIHRILRPNQPIEACMFESDLERGAFH 
LGGYYRGKLISIASFHQAEHSDLQGQKQYQLP.GMATLEG 
YRDQKAGS SLIR1TAEQIERKRG ADLLWCN ARTS AS GYYK 
TCI OFSFnOFTFDTPPVGPHlLJVlYTCRLT 

|\- 1 A 11 1 livVVJ.I \a\ l.U X X X V VJX XXXX-xLVX X XVXVX-f X 


SEQ ID NO:457


10_4H4 


MIJBViaPlN^ 

x^GGFYRGia.ISIASmQAEHSELQGQKQYQLRGMATLEG 

YREQKAGSSIJKHAEEILRI^^ 

t r^PQprir^PVTTnTPPvrTPTTTT mytcrtt 


SEQ ID NO:458


11_3A11 


MffiVKPINAEDT^ 

LGGl^GKI.ISIASmQAEIIPDLQGQKQYQLRGIvlATI^GY 

RDQKAGSSLIKHAEQILJIKRGAD1X 

T rtln^priOP VTTFTPPVnPTITT "MYTCRT !T 

X^\jr&I2i\J\J±2i V X^Xj X XX V vJX X I 1 1 «IVX X XVXVX-* X 


SEQ ID NO:459


11_3B1 


MIJEVKPINAEDTYELRlTOILRPNQPffiACMFETDLERGAFH 
LGGFYRGKlJSIASFHQAFiiSDLQGQKQYQIJRGMATLEGF 
MQKAGSTLIRHAEEILRKRGADLLWCNARTSASGYYICRL 
GFSEOG13FDTPPVGPH1LMYKRLT 


SEQ ID NO:460


11_3B5 


MIEVKPINAFJDTYELJRHRILRPNQPIEACMFESDLLRGAFH 
LGGYYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
RDQKAGSSIJRHAEQILRKRGADMLWCNARTSASGYYKK 
LGFSEQGEVFDTPPVGPHILMYKR1T 


SEQ ID NO:461


11_3C12


MIEVKPINAEDTYELRHRILRPNQPLEVCMYETDLLRGAFH




LGGFYGGKLISIASFHQAEHPDLQGQKQYQLRGMATLEGY 
RDQKAGSSLIRHAEQLLRKRGADLLWCNARTSASGYYKK 

T r^T7CTHOO"RrRT7TPPA/riPTTTT AAYTTPTT 
JbUrr'oJZrl^OrnJLr'jj * » OJrXli-LiiVl I JSJKJL 1 


SEQ ID NO:462


11_3C3 


MffiVKPINAEDTYELRHEOLRPNQPIEACMYESDIiRGAIJH 
LGGYYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
REQKAGSSLIBGIAEEILRKRGADLLWCNARTSASGYYKKL 

riTHQTHO<^T7A/T7T^TPP\/OPTTTT TV/TVT^PTT 
IjrJro Jll^vJx} V VxJ 1 Jrir V \jr JdJLLJVl I JSJtvI I 


SEQ ID NO:463


11_3C6 


MIEVKPlNAEDTYELRHKimPNQPffiACMFESDLLRGAFH 
LGGFYGGKLISIASFHQAEHSDLEGQKQYQLRGMATLEGY 
REQKAGSTLIEIHAEEILRKRGADLLWCNARTSASGYYKKL 

r^TJgT?nnT? 1 U\ VTPP\/r^PTTTT A/TVl^PTT 


SEQ ID NO:464


11_3D6 


MffiVKPINAEDTYELRmiLRPNQPffiVCMYETDLLRGAFH 
LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
REQKAGSSLKHAEQILRKRGADLLWCNARTSASGYYKKL
riinCT7oriT7^/T7riTrpp\/rjPTTrT A/TVH&tpt t 


SEQ ID NO:465


1_1G12 


MLEVKPINAEDTYELRHRILRPNQPIEVCMYETDLLRGAFH 
LGGFYGGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
RDQKAGSSLJKHAEEILRKRGADLLWCNARTSASGYYKKL 
LxroxHjOJi V rrl 1 rr V Olrrtu-JVl x JS-K-L 1 


SEQ ID NO:466


1_1H1 


MffiVKPlNAEETYEUlHKELRPNQPIEACMYESDLLRGSFH 
LGGFYRGQLISIASFHKAEHSELQGQKQYQLRGMATLEGF 
REQKAGSSLIRHAEEELRNKGADLLWCNARTTASGYYKRL 
(jrFo EHCjE V rH 1 FF V OFrlLLJVl I JsJ<l 1 


SEQ ID NO:467


1_1H2 


MIEVKPINAEDTYELRHRILRPNQPLEACMYESDLLRGSFH
LGGFYRGKLISIASFHQAEHSEIJEGQKQYQLRGMATIEGF 
REQKAGSSLIRHAEEELRKRGADLLWCNARTTAAGYYKK
l^OJroii^OJiJUrJJ 1 JrJr V vjJrxlLLJVl x JvKi I 


SEQ ID NO:468


1_1H5 


MIEVKP1NAEDTYEIRHRILRPNQPLEACMYESDLLRGSFH 
LGGFYRGKLISIASFHQAEHSDLEGQKQYQLRGMATLEGY 
RDQKAGSSLIRHAEQ1LRKRGADLLWCNARTTAAGYYKR 
ivVjrorWJ LrH VrJJ 1 FF V LiFJtlLLJVl I JVJvL 1 


SEQ ID NO:469


1_2A12 


MIEVKPINAEDTYELRHPaLRPNQPffiACMYESDLLRGSFH 
LGGFYRGKLISIASFHQAEQSELEGQKQYQLRGMATLEGY 
RDQKAGSTLIKHAEEILRKKGADLLWCNARTSAAGYYKR
LOFoJivjLrillrjJ 1 FF V OFriJULIVl Y JsJKJL, L 


SEQ ID NO:470


1_2B6 


MffiVKPlNAEETYELRHKILRPNQPLEACMYETDLLRGSFH 
LGGFYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGF 
PJDQKAGSSLKHAEElIJtKRGADLLWCNARTSASGYYKKL 
LrJroiivJOiilrii 1 r ir V Lwr J±LLM Y JvKL 1 


SEQ ID NO:471


1_2C4 


MIJEVKPINAEETYELRHKILRPNQPIEACMYETO 
LGGFYRGQLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 

LGFSEQGEVFDTPPVGPHLLMYKKJT 


SEQ ID NO:472


1_2D2 


MIEVKPINAEDTYELRHKILRPNQPLEACMYESDLLRSAFH 
LGGFYRGKLISIASFI1KAEHSELQGQKQYQLRGMATLEGY 
RDQKAGSSlJKHAEEIIJtKRGADMLWCNARTSAAGYYKR 
LGFSEQGEVFDTPPVGPH1LMYKR1T 


SEQ ID NO:473


1_2D4 


MffiVKPINAEDTYELRHRILRPNQPffiACMYESDLLRGSFH 
LGGFYRGKUSL^SFHQAEHSDLQGQKQYQLRGMATLEGY 
REQKAGS SLHCHAEOLLRKKG ADMLWCNARTS A AGYYK
XvL-O Jr o IiXlLJXlXX , Xi 1 rr V KJtr rlXJ^JLVX I JSJxJLX


SEQ ID NO:474


1_2F8 


MIEYBCPE^AEDTYELRERILRPNQPLEACMYETDIJLRGSF 
HLGGFYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEG 
YRDQKAGSSLIRHAEEILRKRGADMLWCNARTTAAGYYK

JSJUVjlroriV<?^J-L-«l I XJ 1 rr V Or\nJXJVl I JSJSJ-, 1 


SEQ ID NO:475


1_2H8 


MIEVKPINAEETYELRHKILRPNQPIJEACMYETDLLRGAFH 
LGGFYRGKLISIASFHQADHSELQGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEQ1LRKRGADLLWCNARTSAAGYYKK 

J-( vjr Jr o xixlVjrxiix i Xw 1 rr V wJrxiLLflVl I XSJvL- 1 


SEQ ID NO:476


1_3A2 


MIEVKPINAEDTYELRHRILRPNQPIEACMYESDLLRGAFH 
LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
REQKAGS SLIRHAEEILJIKKG ADMLWCN ARTTAAGYYKR 

T f^T7CT?r%riT5\7T7r* r rPP\7'riPT4TT TV/TVVP TT 


SEQ ID NO:477


1_3D6 


MIEVKPINAEDTYEIJ^HKIIJRPNQPIEACMYESDLLQGSFH 
LGGFYRGQLISIASFHQAEHSDLQGQKQYQLRGMATLEGF 
REQKAGSTLIKHAEEILRKKGADLLWCNARTSAAGYYKK

LUr o Xiil^CiLrlJ I rr ALrrxl 1 1 4IVL x JsJvL, 1 


SEQ ID NO:478


1_3F3 


MIEVKPINAEETYELRQRILRPNQPIEACMYESDLLRGSFHL 
GGFYRGQLISIASFHQAEHSELQGQKQYQLRGMATLEGYR 
EQKAGSTLIKHAEEELRKKGADLLWCNARTSAAGYYKRL 
LrroJbixlijrSlrlJ I rr V (ji J i±LLM Y KK1 1 


SEQ ID NO:479


1_3H2 


MEVKPINAEDTYl^^ 

LGGYYRGQLISIASFHKAEHSELQGQKQYQIJRGMAT^ 
X^QKAGSTLlimAEQLLl^KGADMLWCNARTSAAGY^ 

PT /TnCUOfJTTX/tTI WDD\7m>TXTT T 

KJ^wJrolil^LrrS V VL) X .Kr V (jJrXi 1 1 ,M I JsJsJL 1 


SEQ ID NO:480


1_4C5 


MEVKPINAEDTYE^ 
LGGFYRGKUSIASFHI^^ 

REQKAGSTLIRHAEEll^RKRGADlVil.WCNART^ 
-Lvjr iJlrlLrxiJLriJ I rr V 1 ifn 11 -ivi I JKJvL 1 


SEQ ID NO:481


1_4D6 


MLEVKPINAEDTYELRHPJLRPNQPIEAC]VrYETDLLRGSFH 
LGGFYRGQLISIASFHKAEHSDLEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEQILRKRGADMLWCNARTSAAGYYKR 
Lur oriy «jui V.rri 1 rr V orrllLM. Y .KJKL. 1 


SEQ ID NO:482


1_4H1 


MIEVKPINAEDTYELRHRILPJPNQPLEACMYETDIXRGSFH 
LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEQLLRKRGADLLWCNARTSASGYYKR 


SEQ ID NO:483


1_5H5 


MLEVKPnvFAEETYELRHKELRPNQPLEACMYESDLLRGSFH 
LGGYYRGQLISIASFHQAEHSELEGQKQYQLRGMATLEGF 

REQKAGSTLIKHAEQILRKRGADMLWCNARTSAAGYYKK 

t nT7CT?unp i nr\ r rpp'\7/*^pxjn r T tv twit t 1 
lAir 1 5 JtiWLiJilrJJ 1 rr V LrrhLLLJVL Y JsJsJL 1 


SEQ ID NO:484


1_6F12 


GGFYRGKLISIASmQAEHSDLEGQKQYQLRGMATLEGYR 

DQKA.GSTIJKHAEELLRKRGADMLWaSfARTSAAGYYKR 

LGFSEHGEIYETPPVGPHILMYKKIT 


SEQ ID NO:485


1_6H6 


MIEVKPINAEDTYELRHKILJIPNQPIEACMYESDIJJIGSFH 
LGGFYRGQLISIASFHQAEHSDLEGQKQYQLRGMATLEGY 
RDQKAGSSLIKHAEEILRKRGADLLWCNARTSAAGYYKR
LGFSEQGEIFDTPPVGPHILMYKKIT 


SEQ ID NO:486


3_11A10


MLEVKPESAEDTYELRHRIIJtfNQPffiACMYESDIXRGAFH




LGGYYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
REQKAGSSLVKHAEEILRKRGADLLWCNARTSASGYYKK 

LAjr oiiy tailirxi 1 rr V OJrJnLLL»JVl I JSJvi 1 


SEQ ID NO:487


3_14F6 


MLEVKPINAEDTYELRHRDJIPNQPffiACMYESDLLRGAFH 
LGGFYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
REQKAGSSLIKHAEEILRKRGADLLWCNARTSASGYYKKL 
(jr o xll^ w JtSJLr xi I rr V IxrJn 1 1 /M. x JvKJL 1 


SEQ ID NO:488


3_15B2 


MI^VKPmAFJDTYELRHKILRPNQPLEVCMYETDLLRGAF 
HLGGYYGGKLISIASFHQAEHSELQGQKQYQLRGMATLE 
GYREQKAGSSLIKHAEEILRKRGADLLWCNARTSASGYYK 
KL.vjrr oEy oElrJb 1 Fir V vjrJr JHJLLM Y KJK1 I 


SEQ ID NO:489


3_6A10 


MEVKPEsTAEDTYELRHRILRPNQPffiACMYESDLIJRGAm 
LGGYYRGKLISIASFHQAEHSEIXJGQKQYQLRGMATLEGY 
REQKAGS SIJCKHAEEILRKRG ADLLWCN ARTS AS G YYKKL 
CjrollQljrJhllhJi 1 FF VOFH1LM YKR1T 


SEQ ID NO:490


3_6B1 


MIEVKPINAEDTYELRHEULRPNQPIEACMYESDLLRGAFH 
LGGYYRGKLISIASFHQAEHPELQGQKQYQLRGMATLEGY 
MQKAGSSLIKHAEEILRKRGADLLWCNARTSASGYYKKL 
OrbliQOJaVrJilFFVGFHIlJviyJvRIT 


SEQ ID NO:491


3_7F9


MI^VKPINAEDTYELPJEIRIUlPNQPffiACMYESDLLRGAFH 
LGGYYGGKLISIASFHQAEHSDLQGQKQYQLRGMATLEG 
YREQKAGS SLIKHAEEILRKRG ADIXWCNARTS AS GYYKK 
jLOraEQGElFEvrPPVGPHILMYK^ 


SEQ ID NO:492


3_8G11 


MLEVKPINAEDTYELPJHRILIU'NQPffiVCMYESDLIJIGAFH 
LGGYYRGKLISIASFHQAEHSELQGQKQYQLRGMATLEGY 
PJEQKAGSSLLKHAEEILRKRG ADIXWCNARTS AS G YYKKL 
CjFaxlQGjblFlll FFVoFjHIjLMYKRIT 


SEQ ID NO:493


4_1B10 


MffiVKPLNAEDTYELRHRIURPNQPffiVCMYETDLLRGAFH 
LGGFYGGKUSIASmQAEHSDLQGQKQYQLRGMATLEGY 
RDQKAGSSLIRHAEQILRKRGADMLWCNARTSASGYYKK 
LGFSEQGEIFETPPVGPHD^MY 


SEQ ID NO:494


5_2B3 


MIEVKPINAEDTYELRHRILRPNQPLEVCMYETDLLRGAFH 
LGGFYGGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
RDQKAGSSLIRHAEQILRKRGADMLWCNARTSASGYYKK 

T /^TJflT!/^ /"II Tl'¥ ■«» y HIVTVT T/ — (T>TTTT TV JTK.t'tJ'' 1 > 

i^GrbEQGEIEEIrPVGPHILMYKRIT 


SEQ ID NO:495


5_2D9 


MLXVKPINAEDTYELRHKILRPNQPXEVCMYEXDl^ 
HLGGFYRGKLISIASFHQAEHSDLQGQKQYQOIGMATLEG 
YRDQKAGSSLIKHAEQIIJRERGADMLWCNARTSASGYYK 
JCL(jrJhoEQGEVrDTPPVGPHILM 


SEQ ID NO:496


5_2F10 


MLEVKPINAEDTYELRHKILRPNQPIEVCMYETDLI^ 
HLGGFYGGKLISIASFHQAEHSDLQGQKQYQLRGMATLEG 
i xvi^v^is^oooi^ixvn^\J2iV^Xl^^ 1 oAou x x Jv 
KLGFSEQGEH^PPVGPHIIJSIYKRLT 


SEQ ID NO:497


6_1A11 


Ml^VKPINAEDTYELRHKILRPNQPLEVCM 

HLGGFYRGKLISIASFHQAmSDLQGQ 

YRDQKAGSSLIRHAEQILRKRGADMLWCNARTSASGYYR

KLGFSEQGEVFETPPVGPHIIJVI^KRLT 


SEQ ID NO:498


6_1D5 


MLEVKPllNrAEDTYEUlHKlI.RPN^ 

HlJ3GFYRGKLISIASmQAEHSDLQGQKQ^ 

YRDQKAGSSLIRHAEQILRKRGADMLWCNAR
TCI OF^FHrrFVPFTPPVCrPTTTT MYTCRTT


SEQ ID NO:499


6_1F11 


MIEVKPINAEDTYFXRHKIIJ^N^ 

HLGGFYRGKLISJASFHQAEHSDLQGQKQYQI^GMATLEG 
YMQKAGSSLIRHAEQ1LRKRGADMLWCNARTS AS GYYK 
KT OFSFOnFVFFTPPVGPHTLA/TYTCRT T 

XVJLwVJ J. kjJLiV^UU V X JLv XXX V VJX XIII .<1VX X JLVIVLj X 


SEQ ID NO:500


6_1F1 


MLEVXPINAEDTYELimKIIJIPNQPLEVCMYETDLLRGAF 
HIGGFYRGKLKIASFHQAEHSELQGQKQYQLRGMATLEG 
YRDQKAGSSLIRHAEQILRKRGADMLWCNARTSASGYYK
KT GFSFOOFVFFTPPVfrPT-TTT MYTCRT T 

X> 1 ;VJl Ji-jy^vlJu V X X-*X XXV Ul XXXJLflVX X XVXVX-/ X 


SEQ ID NO:501


6_1H10 


MIJEVKPINAEDTYELRHKEJU^NQPLEVCMYETDLLRGAF 
HLGGFYGGKLISIASFHQAEHSDLQGQKQYQLRGMATLEG 
YRDQKAGSSLEIHAEEILRKRGADMLWCNARTSASGYYK 
KLGFSEOGEVFDTPPVGPHjT mykktt 

XVI VJX OXvV^WXv V X XV XXX Y VJX XXXXwlVX X xvxvxx 


SEQ ID NO:502


6_1H4 


MI£VKPINAEDTYELRHKILRPNQPLEVCMYETDLLRGAF 

HLGGFYGGKLISIASFHQAEHSDLQGQKQYQLRGMATLEG 

YRDQKAGSTLJKHAEQ1LRKRG ADMLWCN ARTS AS GYYK 

KT CrFSFOfrFV 1 'PPVOPTTTT MVTTRT T 
xvXjvjx o xj*\£ *-JXj v x xj X x x v v ir n n .ivi x xvxvXj x 


SEQ ID NO:503


8_1F8 


MIEVKPINAEDTYELPJEx^^ 

LGGFYRGKLISIASFHQAEHSDLQGQKQYQLRGMATLEGY 
REQKAGS SIJDKHAEEILRKRG ADLLWCN ARTS ASG YYKKL 

(TF^FOfrFiFnTPPVOPWiT A/TVKTJTT 
^~jx\xXj\j^jx-jXX \Ly x xrr v urn 1 1 <ivx I IVIvX x 


SEQ ID NO:504


8_1G2 


MIEVCTINAEDT 

l^GGYYRGKLISIASmQAEHSELQGQKQYQIJR.GMATLEG 

YREQKAGSSLIiaxAEElIJRKR 

T frF^FOfrF V FFTPPVOPHTT A/TY"R*PT T 

X^\JX^OX-/V^vJXJ/ V X X_» X XX V VJi XjJJLaIVX X XVXVX-» A 


SEQ ID NO:505


8_1G3 


IVD^X^INAEDTYELxvH^^ 

HLGGYYRGKIJSIASraQ 

YlvEQl^GSSLIlvHA^ 

XwVjX^Ox^V^VJJiJLIrJL/ X xx V vJx XXJLLlVx I xvxvx X 


SEQ ID NO:506


8_1H7 


MLEVl^xNAEDTYEIJEvHM 

LGGFYRGICLISIASI^QAEHSELQGQKQYQL^ 

REQKAGSSLIimAEEILRI^ 

X^VJX OX^>s^V_TXiXX^X_» X XX V VJX XXXX-tlYX I XVxXXj X 


SEQ ID NO:507


8_1H9 


IVlI^VKPINAEDTYELlxHKI^ 

HLGGYYRGKLISIASraQAEHSDLQGQKQYQLRGMATLE 
GYlvEQKAGSSLlRHAEEILxvIvRGADx^ 

TCI frF^FOrrFVFnTPPVOPWTT MVYPT T 
iVIjUToxiyUXi VrL/ 1 XX V KJJc xl 1 LilVx X xvivx-ix 


SEQ ID NO:508


GAT1_21F12


MIEVKPINAEDTYEIPJxRIIJRPNQPIJEACKYETDIXGGTFH 

LGGYYRGKLISIASFHNAEIISELEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEEIXRKKGADLLWCNARTSVSGYYKK 

T OFQFOriFVVT^TPPTri'D'HTT A/TVFin T 
x^vJ-T o Xl\£ OrJtl V I iJix^JrlOx xxJLLlVl i JSJSJL 1 


SEQ ID NO:509


GAT1_24G3


MffiVKPINAEDTYEIIvE^^ 

LGGYYRGKLISIASFHQAEHSEIJEGQKQYQLRGMATLEGY 

REQKAGSTLIRHAEELLRKKGADLLWCNARTFVSGYYEK 

LGFSEQGEVYDIPPIGPY1LMYEKLT 


SEQ ID NO:510


GAT1_29G1


MTEVKPINAEDTYEIRHRII^NQPl^ACMYETDLLGGTFH 
LGGYYRGKEIS1ASFHQAIMSEIJEGQKQYQLRGMATLEGY 
REQKAGSTL1RHAEFJJLRKKGADIXWCNARTSVSGYYKK 
LGFSEQGGVCDIPPIGPHILMYKKLA 


SEQ ID NO:511


GAT1_32G1


MIEVKPINAEDTYEIRHRILRPNQPEEACMYETDIXGGTF^


LGGYYRGKIJSIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLERHAEELLRKKGADLLWCNARTSVSGYYEK 
T fiFSFOOFVYDTPPTGP'HiT MVKKT T 

.L/VJX OJ— jV^ vJXJ/ V X X-/XX -L l\Jx XXX 1 /1YX A JViVLfl 


SEQ ID NO:512


GAT2_15G8


MIEVKPINAEDTYErRJHIlIIJRPNQPLEACKYETDLLGGTFH 
LGGYYRGKLISIASFHNAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAERT T .RKKGADLLWCNARTSVSGYYKK 
T fH^FOOFVYn 1PPTOPHTT MWKT T 


SEQ ID NO:513


GAT2_19H8


MIE\^INAEDTYEIPJ1RILRPNQPLEACMYETDU^GGTFH 
LGGYYRGKLISIASFHQAEHPELEGQKQYQLRGMATLEGY 
REQKAGSTLIRHAEELLRKKGADLLWCNARTSVSGYYEK
I GFSEOGEVCDTPPTGPHTT MYKKLT 


SEQ ID NO:514


GAT2_21F1


MIEVKPlNAEDTYEIRHPJLRPNQPIEAClvrxTTDIXGGTm 
LGGYYRGKLISIASFHQAEHSELEGQKQYQLRGMATLEGY 
REQKAGSTLmHAEELLRKKGADLLWCNARTSVSGYYKK 
LGFSEQGGVYD1PPIGPHILMYKKLT 


SEQ ID NO:515


B. licheniformis ribosome binding site


AACTGAAGGAGGAATCTC 






WO 02/36782 



PCT/US01/46227 



WHAT IS CLAIMED IS: 

1. An isolated or recombinant polynucleotide comprising: 

(a) a nucleotide sequence encoding an amino acid sequence that can be optimally aligned 
with a sequence selected from the group consisting of SEQ ID NO:300, SEQ ID 
NO:445 and SEQ ID NO:457 to generate a similarity score of at least 430, using the 
BLOSUM62 matrix, a gap existence penalty of 11, and a gap extension penalty of 1;
or 

(b) a complementary nucleotide sequence thereof. 

2. The isolated or recombinant polynucleotide of claim 1, wherein the 
polypeptide has glyphosate N-acetyl transferase activity. 

3. The isolated or recombinant polynucleotide of claim 2, wherein the 
polypeptide catalyzes the acetylation of glyphosate with a kcat/Km of at least 10 mM⁻¹
min⁻¹ for glyphosate.

4. The isolated or recombinant polynucleotide of claim 2, wherein the
polypeptide catalyzes the acetylation of aminomethylphosphonic acid. 

5. An isolated or recombinant polynucleotide comprising a nucleotide 
sequence encoding a polypeptide having glyphosate N-acetyltransferase activity, the 
polypeptide comprising an amino acid sequence comprising at least 20 contiguous amino 
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300, 
SEQ ID NO:445 and SEQ ID NO:457. 

6. The isolated or recombinant polynucleotide of claim 5, wherein the 
polypeptide comprises an amino acid sequence comprising at least 50 contiguous amino 
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300, 
SEQ ID NO:445 and SEQ ID NO:457. 

7. The isolated or recombinant polynucleotide of claim 5, wherein the 
polypeptide comprises an amino acid sequence comprising at least 100 contiguous amino 
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300, 
SEQ ID NO:445 and SEQ ID NO:457. 

8. The isolated or recombinant polynucleotide of claim 5, wherein the 
polypeptide comprises an amino acid sequence comprising about 140 contiguous amino 
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300, 
SEQ ID NO:445 and SEQ ID NO:457.

9. The isolated or recombinant polynucleotide of claim 5, wherein the
polypeptide comprises an amino acid sequence selected from the group consisting of SEQ
ID NO:300, SEQ ID NO:445 and SEQ ID NO:457.

10. The isolated or recombinant polynucleotide of claim 5, comprising a 
nucleotide sequence selected from the group consisting of SEQ ID NO:48, SEQ ID 
NO:193 and SEQ ID NO:205. 

11. The polynucleotide of claim 1, wherein a parental codon has been 
replaced by a synonymous codon that is preferentially used in plants relative to the 
parental codon. 

12. The polynucleotide of claim 1, further comprising a nucleotide 
sequence encoding an N-terminal chloroplast transit peptide. 

13. A non-native variant of the polynucleotide of claim 1, wherein one or 
more amino acids of the encoded polypeptide have been mutated. 

14. A nucleic acid construct comprising the polynucleotide of claim 1. 

15. The nucleic acid construct of claim 14, comprising a promoter 
operably linked to the polynucleotide of claim 1, where the promoter is heterologous with 
respect to the polynucleotide and effective to cause sufficient expression of the encoded 
polypeptide to enhance the glyphosate tolerance of a plant cell transformed with the 
nucleic acid construct. 

16. The nucleic acid construct of claim 14, wherein the polynucleotide 
sequence of claim 1 functions as a selectable marker. 

17. The nucleic acid construct of claim 14, wherein the construct is a vector.

18. The vector of claim 17 comprising a second polynucleotide sequence 
encoding a second polypeptide that confers a detectable phenotypic trait upon a cell or 
organism expressing the second polypeptide at an effective level. 

19. The vector of claim 18, wherein the detectable phenotypic trait 
functions as a selectable marker.

20. The vector of claim 19, wherein the detectable phenotypic trait consists 
of herbicide resistance, pest resistance, or a visible marker. 

21. The vector of claim 17, wherein the vector comprises a T-DNA sequence.

22. The vector of claim 17, wherein the polynucleotide is operably linked 
to a regulatory sequence.

23. The vector of claim 17, wherein the vector is a plant transformation vector.

24. An isolated or recombinant polynucleotide comprising: 

(a) a nucleotide that hybridizes under stringent conditions over substantially the entire
length of a nucleotide sequence that encodes an amino acid sequence selected from the
group consisting of SEQ ID NO:300, SEQ ID NO:445 and SEQ ID NO:457;

(b) a complementary nucleotide sequence thereof; or

(c) a fragment of (a) or (b) that encodes a polypeptide having glyphosate
N-acetyltransferase activity.

25. The polynucleotide of claim 24, comprising a nucleotide sequence that
encodes a glyphosate N-acetyl transferase.

26. A composition comprising two or more polynucleotides of claim 1. 

27. The composition of claim 26 comprising at least ten polynucleotides of
claim 1.

28. A cell comprising at least one polynucleotide of claim 1, wherein the
polynucleotide is heterologous to the cell.

29. The cell of claim 28, wherein the polynucleotide is operably linked to a 
regulatory sequence. 

30. A cell transduced by the vector of claim 17. 

31. The cell of claim 28 or 30, wherein the cell is a transgenic plant cell.

32. The transgenic plant cell of claim 31, wherein the plant cell expresses 
an exogenous polypeptide with glyphosate N-acetyl transferase activity. 

33. A transgenic plant or transgenic plant explant comprising the cell of
claim 32.

34. The transgenic plant or transgenic plant explant of claim 33, wherein
the plant or plant explant expresses a polypeptide with glyphosate N-acetyl transferase
activity.

35. The transgenic plant or transgenic plant explant of claim 34, wherein
the transgenic plant or plant explant is a crop plant selected from among the genera:
Eleusine, Lolium, Bambusa, Brassica, Dactylis, Sorghum, Pennisetum, Zea, Oryza,
Triticum, Secale, Avena, Hordeum, Saccharum, Coix, Glycine and Gossypium.

36. The transgenic plant or transgenic plant explant of claim 34, wherein
the transgenic plant or plant explant is Arabidopsis.

37. The transgenic plant or transgenic plant explant of claim 34, wherein
the transgenic plant or plant explant is Gossypium.

38. The transgenic plant or transgenic plant explant of claim 34, wherein
the plant or plant explant exhibits enhanced resistance to glyphosate as compared to a
wild-type plant of the same species, strain or cultivar.

39. A seed produced by the plant of claim 34. 

40. A transgenic plant which contains a heterologous gene which encodes
a glyphosate N-acetyltransferase having a kcat/Km of at least 10 mM⁻¹ min⁻¹ for
glyphosate, wherein the plant exhibits tolerance to glyphosate applied at a level effective
to inhibit the growth of the same plant lacking the heterologous gene, without significant
yield reduction due to herbicide application.

41. The transgenic plant of claim 40, wherein the glyphosate N- 
acetyltransferase catalyzes the acetylation of aminomethylphosphonic acid. 

42. An isolated or recombinant polypeptide comprising an amino acid
sequence that can be optimally aligned with a sequence selected from the group consisting
of SEQ ID NO:300, SEQ ID NO:445 and SEQ ID NO:457 to generate a similarity score of
at least 430 using the BLOSUM62 matrix, a gap existence penalty of 11, and a gap
extension penalty of 1, wherein the polypeptide has glyphosate N-acetyl transferase
activity.

43. The isolated or recombinant polypeptide of claim 42, wherein the
polypeptide catalyzes the acetylation of glyphosate with a kcat/Km of at least 10 mM⁻¹
min⁻¹ for glyphosate.

44. The isolated or recombinant polypeptide of claim 43, wherein the
polypeptide catalyzes the acetylation of glyphosate with a kcat/Km of at least 100 mM⁻¹
min⁻¹ for glyphosate.

45. The isolated or recombinant polypeptide of claim 44, wherein the 
polypeptide catalyzes the acetylation of aminomethylphosphonic acid. 

46. An isolated or recombinant polypeptide having glyphosate
N-acetyltransferase activity, the polypeptide comprising an amino acid sequence comprising
at least 20 contiguous amino acids of an amino acid sequence selected from the group
consisting of SEQ ID NO:445 and SEQ ID NO:457.

47. The isolated or recombinant polypeptide of claim 46, wherein the 
polypeptide comprises an amino acid sequence comprising at least 50 contiguous amino
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300,
SEQ ID NO:445 and SEQ ID NO:457.

48. The isolated or recombinant polypeptide of claim 46, wherein the
polypeptide comprises an amino acid sequence comprising at least 100 contiguous amino
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300,
SEQ ID NO:445 and SEQ ID NO:457.

49. The isolated or recombinant polypeptide of claim 46, wherein the
polypeptide comprises an amino acid sequence comprising about 140 contiguous amino
acids of an amino acid sequence selected from the group consisting of SEQ ID NO:300,
SEQ ID NO:445 and SEQ ID NO:457.

50. The isolated or recombinant polypeptide of claim 46, wherein the 
polypeptide comprises an amino acid sequence selected from the group consisting of SEQ 
ID NO:300, SEQ ID NO:445 and SEQ ID NO:457. 

51. The polynucleotide sequence of claim 42 further comprising an
N-terminal chloroplast transit peptide.

52. A non-native variant of the polypeptide of claim 42, wherein one or 
more amino acids of the polypeptide have been mutated. 

53. A non-native variant of the polypeptide of claim 42, wherein one or 
more amino acids of the polypeptide have been altered relative to a parental polypeptide. 

54. The polypeptide of claim 53, wherein the polypeptide is produced by a
diversity generating procedure.

55. The polypeptide of claim 54, wherein the diversity generating
procedure comprises mutation or recombination of at least one parental polynucleotide
encoding a glyphosate N-acetyltransferase polypeptide.

56. The polypeptide of claim 55, wherein the parental polynucleotide is a
polynucleotide of claim 1.

57. The polypeptide of claim 42 comprising a secretion sequence or a 
localization sequence. 

58. The polypeptide of claim 57 comprising a chloroplast transit sequence.

59. A polypeptide which is specifically bound by a polyclonal antiserum
raised against one or more antigens, the antigen comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:300, SEQ ID NO:445 and SEQ ID
NO:457.

60. A polypeptide having GAT activity characterized by:

(a) a Km for glyphosate of at least about 2 mM or less;

(b) a Km for acetyl CoA of at least about 200 µM or less; and

(c) a kcat equal to at least about 6/minute. 

61. A method of producing a glyphosate resistant transgenic plant or plant
cell comprising:

(a) transforming a plant or plant cell with a polynucleotide encoding a glyphosate N- 
acetyltransferase; and 

(b) optionally regenerating a transgenic plant from the transformed plant cell. 

62. The method of claim 61, wherein the polynucleotide is a
polynucleotide of claim 1.

63. The method of claim 61, wherein the polynucleotide is derived from a 
bacterial source. 

64. The method of claim 61, comprising growing the transformed plant or
plant cell in a concentration of glyphosate that inhibits the growth of a wild-type plant of
the same species, which concentration does not inhibit the growth of the transformed
plant.

65. The method of claim 64, comprising growing the transformed plant or 
plant cell or progeny of the plant or plant cell in increasing concentrations of glyphosate. 

66. The method of claim 64, comprising growing the transformed plant or
plant cell in a concentration of glyphosate that is lethal to a wild-type plant or plant cell of
the same species.

67. The method of claim 62, which comprises propagating a plant 
transformed with the polynucleotide of claim 1. 

68. The method of claim 67, wherein a first plant is propagated by crossing
between the first plant and a second plant, such that at least some progeny of the cross
display glyphosate tolerance.

69. A method for producing a variant of a polynucleotide of claim 1 
comprising recursively recombining a polynucleotide of claim 1 with a second 
polynucleotide, thereby forming a library of variant polynucleotides. 

70. The method of claim 69, comprising selecting a variant polynucleotide
from the library on the basis of glyphosate N-acetyltransferase activity.

71. The method of claim 70, wherein the recursive recombination is 
performed in vitro.

72. The method of claim 70, wherein the recursive recombination is
performed in vivo.

73. The method of claim 70, wherein the recursive recombination is 
performed in silico. 

74. The method of claim 70, wherein the recursive recombination 
comprises family shuffling. 

75. The method of claim 70, wherein the recursive recombination 
comprises a synthetic shuffling method. 

76. The method of claim 70, comprising replacing at least one parental 
codon in a nucleotide sequence with a synonymous codon that is preferentially used in 
plants relative to the parental codon. 

77. A library of variant polynucleotides produced by the method of claim 70.

78. A population of cells comprising the library of claim 77. 

79. A recombinant polynucleotide produced by the method of claim 70, 
wherein the recombinant polynucleotide encodes a polypeptide with glyphosate N- 
acetyltransferase activity. 

80. A cell comprising the polynucleotide of claim 79. 

81. The cell of claim 80, wherein the cell is a plant cell. 

82. The cell of claim 81, wherein the cell is a transgenic plant cell. 

83. A seed produced by the plant of claim 82. 

84. A polypeptide encoded by the polynucleotide of claim 79. 

85. A method for producing a variant of a polynucleotide of claim 1 
comprising mutating the polynucleotide. 

86. A polynucleotide produced by the method of claim 85. 

87. A method for selecting a plant or cell containing a nucleic acid 
construct, the method comprising: 

(a) providing a transgenic plant or cell containing a nucleic acid construct, wherein the 
nucleic acid construct comprises a nucleotide sequence that encodes a glyphosate N- 
acetyltransferase; 

(b) growing the plant or cell in the presence of glyphosate under conditions where the
glyphosate N-acetyltransferase is expressed at an effective level, whereby the
transgenic plant or cell grows at a rate that is discernibly greater than the plant or cell
would grow if it did not contain the nucleic acid construct.

88. The method of claim 87, wherein the nucleic acid construct comprises
a second nucleotide sequence encoding a polypeptide and a regulatory sequence operably
linked to the second nucleotide sequence.

89. A method for selectively controlling weeds in a field containing a crop
comprising:

(a) planting the field with crop seeds or plants which are glyphosate-tolerant as a result of
being transformed with a gene encoding a glyphosate N-acetyltransferase; and

(b) applying to the crop and weeds in the field a sufficient amount of glyphosate to control 
the weeds without significantly affecting the crop. 

90. A method of producing a genetically transformed plant that is tolerant
toward glyphosate, comprising:

(a) inserting into the genome of a plant cell a recombinant, double-stranded DNA 
molecule comprising: 

(i) a promoter which functions in plant cells to cause the production of an RNA
sequence;

(ii) a structural DNA sequence that causes the production of an RNA sequence which 
encodes a polypeptide of claim 42; and 

(iii) a 3' non-translated region which functions in plant cells to cause the addition of a
stretch of polyadenyl nucleotides to the 3' end of the RNA sequence; 

where the promoter is heterologous with respect to the structural DNA sequence
and adapted to cause sufficient expression of the encoded polypeptide to enhance
the glyphosate tolerance of a plant cell transformed with the DNA molecule;

(b) obtaining a transformed plant cell; and

(c) regenerating from the transformed plant cell a genetically transformed plant which has
increased tolerance to glyphosate.

91. A method for producing a crop comprising: 

(a) growing a crop plant that is glyphosate-tolerant as a result of being transformed with a
gene encoding a glyphosate N-acetyltransferase, under conditions such that the crop
plant produces a crop; and

(b) harvesting a crop from the crop plant.

92. The method of claim 91 that comprises applying glyphosate to the 
crop plant at a concentration effective to control weeds. 

93. The method of claim 92, where the crop is cotton, corn, or soybean.

94. The isolated or recombinant polynucleotide of claim 1, wherein of the
amino acid residues in the amino acid sequence that correspond to the following positions,
at least 90% conform to the following restrictions:

(a) at positions 2, 4, 15, 19, 26, 28, 31, 45, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 123,
129, 139, and/or 145 the amino acid residue is B1; and

(b) at positions 3, 5, 8, 10, 11, 14, 17, 18, 24, 27, 32, 37, 38, 47, 48, 49, 52, 57, 58, 61, 62,
63, 68, 69, 79, 80, 82, 83, 89, 92, 100, 101, 104, 119, 120, 124, 125, 126, 128, 131,
143, and/or 144 the amino acid residue is B2;

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y,
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E,
G, H, K, P, S, and T.

95. The isolated or recombinant polynucleotide of claim 1, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 80% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 129,
139, and/or 145 the amino acid residue is Z1;

(b) at positions 31 and/or 45 the amino acid residue is Z2; 

(c) at positions 8 and/or 89 the amino acid residue is Z3; 

(d) at positions 82, 92, 101 and/or 120 the amino acid residue is Z4;

(e) at positions 3, 11, 27 and/or 79 the amino acid residue is Z5;

(f) at position 123 the amino acid residue is Z1 or Z2;

(g) at positions 12, 33, 35, 39, 53, 59, 112, 132, 135, 140, and/or 146 the amino acid
residue is Z1 or Z3;

(h) at position 30 the amino acid residue is Z1 or Z4;

(i) at position 6 the amino acid residue is Z1 or Z6;

(j) at positions 81 and/or 113 the amino acid residue is Z2 or Z3;

(k) at positions 138 and/or 142 the amino acid residue is Z2 or Z4;

(l) at positions 5, 17, 24, 57, 61, 124 and/or 126 the amino acid residue is Z3 or Z4;

(m) at position 104 the amino acid residue is Z3 or Z5;

(o) at positions 38, 52, 62 and/or 69 the amino acid residue is Z3 or Z6;

(p) at positions 14, 119 and/or 144 the amino acid residue is Z4 or Z5;

(q) at position 18 the amino acid residue is Z4 or Z6;

(r) at positions 10, 32, 48, 63, 80 and/or 83 the amino acid residue is Z5 or Z6;

(s) at position 40 the amino acid residue is Zl, Z2 or Z3; 
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(t) at positions 65 and/or 96 the amino acid residue is Zl, Z3 or Z5; 
(u) at positions 84 and/or 115 the amino acid residue is Zl, Z3 or Z4; 
(v) at position 93 the amino acid residue is Z2, Z3 or Z4; 
(w) at position 130 the amino acid residue is Z2, Z4 or Z6; 
5 (x) at positions 47 and/or 58 the amino acid residue is Z3, Z4 or Z6; 

(y) at positions 49, 68, 100 and/or 143 the amino acid residue is Z3, Z4 or Z5; 
(z) at position 131 the amino acid residue is Z3, Z5 or Z6; 

(aa) at positions 125 and/or 128 the amino acid residue is Z4, Z5 or Z6; 

(ab) at position 67 the amino acid residue is Zl, Z3, Z4 or Z5; 

10 (ac) at position 60 the amino acid residue is Zl, Z4, Z5 or Z6; and 
(ad) at position 37 the amino acid residue is Z3, Z4, Z5 or Z6; 

wherein Zl is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
15 from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 

consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 
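
As a non-limiting illustration, restrictions that permit more than one residue class at a position (for example "Z1 or Z2") can be represented as unions of the Z1 to Z6 groups. The Python sketch below encodes only a handful of the items of claim 95. The helper names, and the choice to pass residues as a position-to-residue mapping, are assumptions made for the example and are not taken from the specification.

```python
# Illustrative sketch only; names are hypothetical and not from the patent.
Z = {
    "Z1": set("AILMV"),
    "Z2": set("FWY"),
    "Z3": set("NQST"),
    "Z4": set("RHK"),
    "Z5": set("DE"),
    "Z6": set("CGP"),
}


def allowed(*classes):
    """Union of the residue sets for the named classes, e.g. allowed('Z1', 'Z2')."""
    out = set()
    for name in classes:
        out |= Z[name]
    return out


# A subset of the items of claim 95, keyed by 1-based position.
SUBSET = {
    31: allowed("Z2"), 45: allowed("Z2"),       # item (b)
    8: allowed("Z3"), 89: allowed("Z3"),        # item (c)
    123: allowed("Z1", "Z2"),                   # item (f)
    40: allowed("Z1", "Z2", "Z3"),              # item (s)
    37: allowed("Z3", "Z4", "Z5", "Z6"),        # item (ad)
}


def conforming_positions(residue_at, restrictions):
    """Sorted positions whose residue (from a position -> residue map) is allowed."""
    return sorted(p for p, ok in restrictions.items() if residue_at.get(p) in ok)
```

The six groups are disjoint and together cover the twenty standard amino acids, so each residue belongs to exactly one class. A multi-class restriction is therefore just a set union.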

96. The isolated or recombinant polynucleotide of claim 1, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

97. The isolated or recombinant polynucleotide of claim 1, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 20, 36, 42, 50, 64, 72, 75, 76, 78, 94, 98, 110, 121, and/or 141 the 
amino acid residue is Z1; 

(b) at positions 13, 46, 56, 70, 107, 117, and/or 118 the amino acid residue is Z2; 

(c) at positions 23, 55, 71, 77, 88, and/or 109 the amino acid residue is Z3; 

(d) at positions 16, 21, 41, 73, 85, 99, and/or 111 the amino acid residue is Z4; 

(e) at positions 34 and/or 95 the amino acid residue is Z5; 

(f) at positions 22, 25, 29, 43, 44, 66, 74, 87, 102, 108, 116, 122, 127, 133, 134, 136, and/or 
137 the amino acid residue is Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 

98. The isolated or recombinant polynucleotide of claim 94, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

99. The isolated or recombinant polynucleotide of claim 94, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

100. The isolated or recombinant polynucleotide of claim 94, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

101. The isolated or recombinant polynucleotide of claim 95, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

102. The isolated or recombinant polynucleotide of claim, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 80% conform to the following restrictions: 

(a) at position 2 the amino acid residue is I or L; 

(b) at position 3 the amino acid residue is E or D; 

(c) at position 4 the amino acid residue is V, A or I; 

(d) at position 5 the amino acid residue is K, R or N; 

(e) at position 6 the amino acid residue is P or L; 

(f) at position 8 the amino acid residue is N, S or T; 

(g) at position 10 the amino acid residue is E or G; 

(h) at position 11 the amino acid residue is D or E; 

(i) at position 12 the amino acid residue is T or A; 

(j) at position 14 the amino acid residue is E or K; 

(k) at position 15 the amino acid residue is I or L; 

(l) at position 17 the amino acid residue is H or Q; 

(m) at position 18 the amino acid residue is R, C or K; 

(n) at position 19 the amino acid residue is I or V; 

(o) at position 24 the amino acid residue is Q or R; 

(p) at position 26 the amino acid residue is L or I; 

(q) at position 27 the amino acid residue is E or D; 

(r) at position 28 the amino acid residue is A or V; 

(s) at position 30 the amino acid residue is K, M or R; 

(t) at position 31 the amino acid residue is Y or F; 

(u) at position 32 the amino acid residue is E or G; 

(v) at position 33 the amino acid residue is T, A or S; 

(w) at position 35 the amino acid residue is L, S or M; 

(x) at position 37 the amino acid residue is R, G, E or Q; 

(y) at position 38 the amino acid residue is G or S; 

(z) at position 39 the amino acid residue is T, A or S; 

(aa) at position 40 the amino acid residue is F, L or S; 

(ab) at position 45 the amino acid residue is Y or F; 

(ac) at position 47 the amino acid residue is R, Q or G; 

(ad) at position 48 the amino acid residue is G or D; 

(ae) at position 49 the amino acid residue is K, R, E or Q; 

(af) at position 51 the amino acid residue is I or V; 

(ag) at position 52 the amino acid residue is S, C or G; 

(ah) at position 53 the amino acid residue is I or T; 

(ai) at position 54 the amino acid residue is A or V; 

(aj) at position 57 the amino acid residue is H or N; 

(ak) at position 58 the amino acid residue is Q, K, N or P; 

(al) at position 59 the amino acid residue is A or S; 

(am) at position 60 the amino acid residue is E, K, G, V or D; 

(an) at position 61 the amino acid residue is H or Q; 

(ao) at position 62 the amino acid residue is P, S or T; 

(ap) at position 63 the amino acid residue is E, G or D; 

(aq) at position 65 the amino acid residue is E, D, V or Q; 

(ar) at position 67 the amino acid residue is Q, E, R, L, H or K; 

(as) at position 68 the amino acid residue is K, R, E, or N; 

(at) at position 69 the amino acid residue is Q or P; 

(au) at position 79 the amino acid residue is E or D; 

(av) at position 80 the amino acid residue is G or E; 

(aw) at position 81 the amino acid residue is Y, N or F; 

(ax) at position 82 the amino acid residue is R or H; 

(ay) at position 83 the amino acid residue is E, G or D; 

(az) at position 84 the amino acid residue is Q, R or L; 

(ba) at position 86 the amino acid residue is A or V; 

(bb) at position 89 the amino acid residue is T or S; 

(bc) at position 90 the amino acid residue is L or I; 

(bd) at position 91 the amino acid residue is I or V; 

(be) at position 92 the amino acid residue is R or K; 

(bf) at position 93 the amino acid residue is H, Y or Q; 

(bg) at position 96 the amino acid residue is E, A or Q; 

(bh) at position 97 the amino acid residue is L or I; 

(bi) at position 100 the amino acid residue is K, R, N or E; 

(bj) at position 101 the amino acid residue is K or R; 

(bk) at position 103 the amino acid residue is A or V; 

(bl) at position 104 the amino acid residue is D or N; 

(bm) at position 105 the amino acid residue is L or M; 

(bn) at position 106 the amino acid residue is L or I; 

(bo) at position 112 the amino acid residue is T or I; 

(bp) at position 113 the amino acid residue is S, T or F; 

(bq) at position 114 the amino acid residue is A or V; 

(br) at position 115 the amino acid residue is S, R or A; 

(bs) at position 119 the amino acid residue is K, E or R; 

(bt) at position 120 the amino acid residue is K or R; 

(bu) at position 123 the amino acid residue is F or L; 

(bv) at position 124 the amino acid residue is S or R; 

(bw) at position 125 the amino acid residue is E, K, G or D; 

(bx) at position 126 the amino acid residue is Q or H; 

(by) at position 128 the amino acid residue is E, G or K; 

(bz) at position 129 the amino acid residue is V, I or A; 

(ca) at position 130 the amino acid residue is Y, H, F or C; 

(cb) at position 131 the amino acid residue is D, G, N or E; 

(cc) at position 132 the amino acid residue is I, T, A, M, V or L; 

(cd) at position 135 the amino acid residue is V, T, A or I; 

(ce) at position 138 the amino acid residue is H or Y; 

(cf) at position 139 the amino acid residue is I or V; 

(cg) at position 140 the amino acid residue is L or S; 

(ch) at position 142 the amino acid residue is Y or H; 

(ci) at position 143 the amino acid residue is K, T or E; 

(cj) at position 144 the amino acid residue is K, E or R; 

(ck) at position 145 the amino acid residue is L or I; and 

(cl) at position 146 the amino acid residue is T or A. 

103. The isolated or recombinant polynucleotide of claim 1, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 80% conform to the following restrictions: 

(a) at positions 9, 76, 94 and 110 the amino acid residue is A; 

(b) at positions 29 and 108 the amino acid residue is C; 

(c) at position 34 the amino acid residue is D; 

(d) at position 95 the amino acid residue is E; 

(e) at position 56 the amino acid residue is F; 

(f) at positions 43, 44, 66, 74, 87, 102, 116, 122, 127 and 136 the amino acid residue is G; 

(g) at position 41 the amino acid residue is H; 

(h) at position 7 the amino acid residue is I; 

(i) at position 85 the amino acid residue is K; 

(j) at positions 20, 36, 42, 50, 72, 78, 98 and 121 the amino acid residue is L; 

(k) at positions 1, 75 and 141 the amino acid residue is M; 

(l) at positions 23, 64 and 109 the amino acid residue is N; 

(m) at positions 22, 25, 133, 134 and 137 the amino acid residue is P; 

(n) at position 71 the amino acid residue is Q; 

(o) at positions 16, 21, 73, 99 and 111 the amino acid residue is R; 

(p) at positions 55 and 88 the amino acid residue is S; 

(q) at position 77 the amino acid residue is T; 

(r) at position 107 the amino acid residue is W; and 

(s) at positions 13, 46, 70, 117 and 118 the amino acid residue is Y. 
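
For illustration only, the single-residue restrictions of claim 103 can be collected into a position-to-residue table, from which the minimum number of matching positions implied by the "at least 80%" language can be computed. The names in this Python sketch are hypothetical and are not drawn from the specification.

```python
# Illustrative sketch only; names are hypothetical and not from the patent.
FIXED_BY_RESIDUE = {
    "A": [9, 76, 94, 110],
    "C": [29, 108],
    "D": [34],
    "E": [95],
    "F": [56],
    "G": [43, 44, 66, 74, 87, 102, 116, 122, 127, 136],
    "H": [41],
    "I": [7],
    "K": [85],
    "L": [20, 36, 42, 50, 72, 78, 98, 121],
    "M": [1, 75, 141],
    "N": [23, 64, 109],
    "P": [22, 25, 133, 134, 137],
    "Q": [71],
    "R": [16, 21, 73, 99, 111],
    "S": [55, 88],
    "T": [77],
    "W": [107],
    "Y": [13, 46, 70, 117, 118],
}

# Invert to a 1-based position -> residue map.
FIXED = {pos: aa for aa, positions in FIXED_BY_RESIDUE.items() for pos in positions}


def min_conforming(n_positions, percent):
    """Smallest count that is at least `percent` percent of n_positions (integer math)."""
    return (n_positions * percent + 99) // 100


def template(fixed, gap="."):
    """Render the fixed residues as a string, with `gap` at unrestricted positions."""
    return "".join(fixed.get(pos, gap) for pos in range(1, max(fixed) + 1))
```

With the 56 positions listed in claim 103, `min_conforming(len(FIXED), 80)` evaluates to 45, so at least 45 of the listed positions would need to carry the recited residue. The `template` helper simply makes the pattern easier to inspect by eye.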

104. The isolated or recombinant polynucleotide of claim 102, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

105. The isolated or recombinant polynucleotide of claim 103, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 90% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 31, 45, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 123, 
129, 139, and/or 145 the amino acid residue is B1; and 

(b) at positions 3, 5, 8, 10, 11, 14, 17, 18, 24, 27, 32, 37, 38, 47, 48, 49, 52, 57, 58, 61, 62, 
63, 68, 69, 79, 80, 82, 83, 89, 92, 100, 101, 104, 119, 120, 124, 125, 126, 128, 131, 
143, and/or 144 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

106. The isolated or recombinant polynucleotide of claim 102, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 20, 36, 42, 50, 64, 72, 75, 76, 78, 94, 98, 110, 121, and/or 141 the 
amino acid residue is Z1; 

(b) at positions 13, 46, 56, 70, 107, 117, and/or 118 the amino acid residue is Z2; 

(c) at positions 23, 55, 71, 77, 88, and/or 109 the amino acid residue is Z3; 

(d) at positions 16, 21, 41, 73, 85, 99, and/or 111 the amino acid residue is Z4; 

(e) at positions 34 and/or 95 the amino acid residue is Z5; 

(f) at positions 22, 25, 29, 43, 44, 66, 74, 87, 102, 108, 116, 122, 127, 133, 134, 136, and/or 
137 the amino acid residue is Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 

107. The isolated or recombinant polynucleotide of claim 103, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 80% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 129, 
139, and/or 145 the amino acid residue is Z1; 

(b) at positions 31 and/or 45 the amino acid residue is Z2; 

(c) at positions 8 and/or 89 the amino acid residue is Z3; 

(d) at positions 82, 92, 101 and/or 120 the amino acid residue is Z4; 

(e) at positions 3, 11, 27 and/or 79 the amino acid residue is Z5; 

(f) at position 123 the amino acid residue is Z1 or Z2; 

(g) at positions 12, 33, 35, 39, 53, 59, 112, 132, 135, 140, and/or 146 the amino acid 
residue is Z1 or Z3; 

(h) at position 30 the amino acid residue is Z1 or Z4; 

(i) at position 6 the amino acid residue is Z1 or Z6; 

(j) at positions 81 and/or 113 the amino acid residue is Z2 or Z3; 

(k) at positions 138 and/or 142 the amino acid residue is Z2 or Z4; 

(l) at positions 5, 17, 24, 57, 61, 124 and/or 126 the amino acid residue is Z3 or Z4; 

(m) at position 104 the amino acid residue is Z3 or Z5; 

(o) at positions 38, 52, 62 and/or 69 the amino acid residue is Z3 or Z6; 

(p) at positions 14, 119 and/or 144 the amino acid residue is Z4 or Z5; 

(q) at position 18 the amino acid residue is Z4 or Z6; 

(r) at positions 10, 32, 48, 63, 80 and/or 83 the amino acid residue is Z5 or Z6; 

(s) at position 40 the amino acid residue is Z1, Z2 or Z3; 

(t) at positions 65 and/or 96 the amino acid residue is Z1, Z3 or Z5; 

(u) at positions 84 and/or 115 the amino acid residue is Z1, Z3 or Z4; 

(v) at position 93 the amino acid residue is Z2, Z3 or Z4; 

(w) at position 130 the amino acid residue is Z2, Z4 or Z6; 

(x) at positions 47 and/or 58 the amino acid residue is Z3, Z4 or Z6; 

(y) at positions 49, 68, 100 and/or 143 the amino acid residue is Z3, Z4 or Z5; 

(z) at position 131 the amino acid residue is Z3, Z5 or Z6; 

(aa) at positions 125 and/or 128 the amino acid residue is Z4, Z5 or Z6; 

(ab) at position 67 the amino acid residue is Z1, Z3, Z4 or Z5; 

(ac) at position 60 the amino acid residue is Z1, Z4, Z5 or Z6; and 

(ad) at position 37 the amino acid residue is Z3, Z4, Z5 or Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 

108. The isolated or recombinant polynucleotide of claim 102, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 80% conform to the following restrictions: 

(a) at positions 9, 76, 94 and 110 the amino acid residue is A; 

(b) at positions 29 and 108 the amino acid residue is C; 

(c) at position 34 the amino acid residue is D; 

(d) at position 95 the amino acid residue is E; 

(e) at position 56 the amino acid residue is F; 

(f) at positions 43, 44, 66, 74, 87, 102, 116, 122, 127 and 136 the amino acid residue is G; 

(g) at position 41 the amino acid residue is H; 

(h) at position 7 the amino acid residue is I; 

(i) at position 85 the amino acid residue is K; 

(j) at positions 20, 36, 42, 50, 72, 78, 98 and 121 the amino acid residue is L; 

(k) at positions 1, 75 and 141 the amino acid residue is M; 

(l) at positions 23, 64 and 109 the amino acid residue is N; 

(m) at positions 22, 25, 133, 134 and 137 the amino acid residue is P; 

(n) at position 71 the amino acid residue is Q; 

(o) at positions 16, 21, 73, 99 and 111 the amino acid residue is R; 

(p) at positions 55 and 88 the amino acid residue is S; 

(q) at position 77 the amino acid residue is T; 

(r) at position 107 the amino acid residue is W; and 

(s) at positions 13, 46, 70, 117 and 118 the amino acid residue is Y. 

109. The isolated or recombinant polynucleotide of claim 1, wherein the 
amino acid residue in the amino acid sequence that corresponds to position 28 is V. 

110. The isolated or recombinant polynucleotide of claim 1, wherein the 
amino acid sequence is selected from the group consisting of SEQ ID NOS:6-10 and 263-514. 

111. The isolated or recombinant polypeptide of claim 42, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 31, 45, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 123, 
129, 139, and/or 145 the amino acid residue is B1; and 

(b) at positions 3, 5, 8, 10, 11, 14, 17, 18, 24, 27, 32, 37, 38, 47, 48, 49, 52, 57, 58, 61, 62, 
63, 68, 69, 79, 80, 82, 83, 89, 92, 100, 101, 104, 119, 120, 124, 125, 126, 128, 131, 
143, and/or 144 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

112. The isolated or recombinant polypeptide of claim 42, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 80% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 129, 
139, and/or 145 the amino acid residue is Z1; 

(b) at positions 31 and/or 45 the amino acid residue is Z2; 

(c) at positions 8 and/or 89 the amino acid residue is Z3; 

(d) at positions 82, 92, 101 and/or 120 the amino acid residue is Z4; 

(e) at positions 3, 11, 27 and/or 79 the amino acid residue is Z5; 

(f) at position 123 the amino acid residue is Z1 or Z2; 

(g) at positions 12, 33, 35, 39, 53, 59, 112, 132, 135, 140, and/or 146 the amino acid 
residue is Z1 or Z3; 

(h) at position 30 the amino acid residue is Z1 or Z4; 

(i) at position 6 the amino acid residue is Z1 or Z6; 

(j) at positions 81 and/or 113 the amino acid residue is Z2 or Z3; 

(k) at positions 138 and/or 142 the amino acid residue is Z2 or Z4; 

(l) at positions 5, 17, 24, 57, 61, 124 and/or 126 the amino acid residue is Z3 or Z4; 

(m) at position 104 the amino acid residue is Z3 or Z5; 

(o) at positions 38, 52, 62 and/or 69 the amino acid residue is Z3 or Z6; 

(p) at positions 14, 119 and/or 144 the amino acid residue is Z4 or Z5; 

(q) at position 18 the amino acid residue is Z4 or Z6; 

(r) at positions 10, 32, 48, 63, 80 and/or 83 the amino acid residue is Z5 or Z6; 

(s) at position 40 the amino acid residue is Z1, Z2 or Z3; 

(t) at positions 65 and/or 96 the amino acid residue is Z1, Z3 or Z5; 

(u) at positions 84 and/or 115 the amino acid residue is Z1, Z3 or Z4; 

(v) at position 93 the amino acid residue is Z2, Z3 or Z4; 

(w) at position 130 the amino acid residue is Z2, Z4 or Z6; 

(x) at positions 47 and/or 58 the amino acid residue is Z3, Z4 or Z6; 

(y) at positions 49, 68, 100 and/or 143 the amino acid residue is Z3, Z4 or Z5; 

(z) at position 131 the amino acid residue is Z3, Z5 or Z6; 

(aa) at positions 125 and/or 128 the amino acid residue is Z4, Z5 or Z6; 

(ab) at position 67 the amino acid residue is Z1, Z3, Z4 or Z5; 

(ac) at position 60 the amino acid residue is Z1, Z4, Z5 or Z6; and 

(ad) at position 37 the amino acid residue is Z3, Z4, Z5 or Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 

113. The isolated or recombinant polypeptide of claim 42, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

114. The isolated or recombinant polypeptide of claim 42, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 20, 36, 42, 50, 64, 72, 75, 76, 78, 94, 98, 110, 121, and/or 141 the 
amino acid residue is Z1; 

(b) at positions 13, 46, 56, 70, 107, 117, and/or 118 the amino acid residue is Z2; 

(c) at positions 23, 55, 71, 77, 88, and/or 109 the amino acid residue is Z3; 

(d) at positions 16, 21, 41, 73, 85, 99, and/or 111 the amino acid residue is Z4; 

(e) at positions 34 and/or 95 the amino acid residue is Z5; 

(f) at positions 22, 25, 29, 43, 44, 66, 74, 87, 102, 108, 116, 122, 127, 133, 134, 136, and/or 
137 the amino acid residue is Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 

115. The isolated or recombinant polypeptide of claim 111, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

116. The isolated or recombinant polypeptide of claim 111, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

117. The isolated or recombinant polypeptide of claim 111, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

118. The isolated or recombinant polypeptide of claim 112, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 
117, 118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 
99, 102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

1 19. The isolated or recombinant polypeptide of claim 42, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 80% conform to the following restrictions: 
10 (a) at position 2 the amino acid residue is I or L; 

(b) at position 3 the amino acid residue is E or D; 

(c) at position 4 the amino acid residue is V, A or I; 

(d) at position 5 the amino acid residue is K, R or N; 

(e) at position 6 the amino acid residue is P or L; 
15 (f) at position 8 the amino acid residue is N, S or T; 

(g) at position 10 the amino acid residue is E or G; 

(h) at position 1 1 the amino acid residue is D or E; 

(i) at position 12 the amino acid residue is T or A; 
(j) at position 14 the amino acid residue is E or K; 

20 (k) at position 15 the amino acid residue is I or L; 

(1) at position 17 the amino acid residue is H or Q; 

(m) at position 18 the amino acid residue is R, C or K; 

(n) at position 19 the amino acid residue is I or V; 

(o) at position 24 the amino acid residue is Q or R; 
25 (p) at position 26 the amino acid residue is L or I; 

(q) at position 27 the amino acid residue is E or D; 

(r) at position 28 the amino acid residue is A or V; 

(s) at position 30 the amino acid residue is K, M or R; 

(t) at position 31 the amino acid residue is Y or F; 
30 (u) at position 32 the amino acid residue is E or G; 

(v) at position 33 the amino acid residue is T, A or S; 

(w) at position 35 the amino acid residue is L, S or M; 

(x) at position 37 the amino acid residue is R, G, E or Q; 

(y) at position 38 the amino acid residue is G or S; 

(z) at position 39 the amino acid residue is T, A or S; 

(aa) at position 40 the amino acid residue is F, L or S; 

(ab) at position 45 the amino acid residue is Y or F; 

(ac) at position 47 the amino acid residue is R, Q or G; 
(ad) at position 48 the amino acid residue is G or D; 

(ae) at position 49 the amino acid residue is K, R, E or Q; 

(af) at position 51 the amino acid residue is I or V; 

(ag) at position 52 the amino acid residue is S, C or G; 

(ah) at position 53 the amino acid residue is I or T; 
(ai) at position 54 the amino acid residue is A or V; 

(aj) at position 57 the amino acid residue is H or N; 

(ak) at position 58 the amino acid residue is Q, K, N or P; 

(al) at position 59 the amino acid residue is A or S; 

(am) at position 60 the amino acid residue is E, K, G, V or D; 
(an) at position 61 the amino acid residue is H or Q; 

(ao) at position 62 the amino acid residue is P, S or T; 

(ap) at position 63 the amino acid residue is E, G or D; 

(aq) at position 65 the amino acid residue is E, D, V or Q; 

(ar) at position 67 the amino acid residue is Q, E, R, L, H or K; 
(as) at position 68 the amino acid residue is K, R, E, or N; 

(at) at position 69 the amino acid residue is Q or P; 

(au) at position 79 the amino acid residue is E or D; 

(av) at position 80 the amino acid residue is G or E; 

(aw) at position 81 the amino acid residue is Y, N or F; 
(ax) at position 82 the amino acid residue is R or H; 

(ay) at position 83 the amino acid residue is E, G or D; 

(az) at position 84 the amino acid residue is Q, R or L; 

(ba) at position 86 the amino acid residue is A or V; 

(bb) at position 89 the amino acid residue is T or S; 
(bc) at position 90 the amino acid residue is L or I; 

(bd) at position 91 the amino acid residue is I or V; 

(be) at position 92 the amino acid residue is R or K; 

(bf) at position 93 the amino acid residue is H, Y or Q; 

(bg) at position 96 the amino acid residue is E, A or Q; 

(bh) at position 97 the amino acid residue is L or I; 

(bi) at position 100 the amino acid residue is K, R, N or E; 
(bj) at position 101 the amino acid residue is K or R; 
(bk) at position 103 the amino acid residue is A or V; 
(bl) at position 104 the amino acid residue is D or N; 
(bm) at position 105 the amino acid residue is L or M; 
(bn) at position 106 the amino acid residue is L or I; 

(bo) at position 112 the amino acid residue is T or I; 
(bp) at position 113 the amino acid residue is S, T or F; 
(bq) at position 114 the amino acid residue is A or V; 
(br) at position 115 the amino acid residue is S, R or A; 
(bs) at position 119 the amino acid residue is K, E or R; 
(bt) at position 120 the amino acid residue is K or R; 
(bu) at position 123 the amino acid residue is F or L; 
(bv) at position 124 the amino acid residue is S or R; 
(bw) at position 125 the amino acid residue is E, K, G or D; 
(bx) at position 126 the amino acid residue is Q or H; 
(by) at position 128 the amino acid residue is E, G or K; 
(bz) at position 129 the amino acid residue is V, I or A; 

(ca) at position 130 the amino acid residue is Y, H, F or C; 

(cb) at position 131 the amino acid residue is D, G, N or E; 

(cc) at position 132 the amino acid residue is I, T, A, M, V or L; 

(cd) at position 135 the amino acid residue is V, T, A or I; 

(ce) at position 138 the amino acid residue is H or Y; 

(cf) at position 139 the amino acid residue is I or V; 
(cg) at position 140 the amino acid residue is L or S; 

(ch) at position 142 the amino acid residue is Y or H; 

(ci) at position 143 the amino acid residue is K, T or E; 
(cj) at position 144 the amino acid residue is K, E or R; 
(ck) at position 145 the amino acid residue is L or I; and 
(cl) at position 146 the amino acid residue is T or A. 
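
As an illustration only, the "at least 80% conform" test recited above can be sketched in Python. The sketch assumes the candidate sequence has already been aligned so that string index i-1 holds the residue corresponding to position i, and that each listed position is counted once. The function names and the abbreviated restriction table (items (a) to (e) only) are hypothetical and are not part of the claims.

```python
from typing import Dict, Set

# Hypothetical, abbreviated table: position (1-based) -> allowed residues.
# Only items (a) to (e) of the list above are included, for brevity.
EXAMPLE_RESTRICTIONS: Dict[int, Set[str]] = {
    2: set("IL"),
    3: set("ED"),
    4: set("VAI"),
    5: set("KRN"),
    6: set("PL"),
}


def fraction_conforming(sequence: str, restrictions: Dict[int, Set[str]]) -> float:
    """Return the fraction of restricted positions whose residue is allowed.

    Assumes `sequence` is pre-aligned so that sequence[pos - 1] corresponds
    to claim position `pos`. Positions beyond the end of the sequence are
    counted as non-conforming.
    """
    if not restrictions:
        raise ValueError("restrictions must not be empty")
    hits = 0
    for pos, allowed in restrictions.items():
        if 1 <= pos <= len(sequence) and sequence[pos - 1] in allowed:
            hits += 1
    return hits / len(restrictions)


def conforms(sequence: str, restrictions: Dict[int, Set[str]],
             threshold: float = 0.80) -> bool:
    """True when at least `threshold` of the restricted positions conform."""
    return fraction_conforming(sequence, restrictions) >= threshold
```

With the abbreviated table, a sequence beginning `MIEVKA` satisfies four of the five restrictions and so meets the 80% threshold, whereas `MAAAAA` satisfies only one.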

120. The isolated or recombinant polypeptide of claim 42, wherein of the 

amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 80% conform to the following restrictions: 

(a) at position 9, 76, 94 and 110 the amino acid residue is A; 

(b) at position 29 and 108 the amino acid residue is C; 

(c) at position 34 the amino acid residue is D; 

(d) at position 95 the amino acid residue is E; 
(e) at position 56 the amino acid residue is F; 

(f) at position 43, 44, 66, 74, 87, 102, 116, 122, 127 and 136 the amino acid residue is G; 

(g) at position 41 the amino acid residue is H; 

(h) at position 7 the amino acid residue is I; 

(i) at position 85 the amino acid residue is K; 

(j) at position 20, 36, 42, 50, 72, 78, 98 and 121 the amino acid residue is L; 

(k) at position 1, 75 and 141 the amino acid residue is M; 

(l) at position 23, 64 and 109 the amino acid residue is N; 

(m) at position 22, 25, 133, 134 and 137 the amino acid residue is P; 

(n) at position 71 the amino acid residue is Q; 
(o) at position 16, 21, 73, 99 and 111 the amino acid residue is R; 

(p) at position 55 and 88 the amino acid residue is S; 

(q) at position 77 the amino acid residue is T; 

(r) at position 107 the amino acid residue is W; and 

(s) at position 13, 46, 70, 117 and 118 the amino acid residue is Y. 
121. The isolated or recombinant polypeptide of claim 119, wherein of the 

amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 13, 20, 36, 42, 46, 50, 56, 64, 70, 72, 75, 76, 78, 94, 98, 107, 110, 117, 
118, 121, and/or 141 the amino acid residue is B1; and 

(b) at positions 16, 21, 22, 23, 25, 29, 34, 41, 43, 44, 55, 66, 71, 73, 74, 77, 85, 87, 88, 95, 99, 
102, 108, 109, 111, 116, 122, 127, 133, 134, 136, and/or 137 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

122. The isolated or recombinant polypeptide of claim 120, wherein of the 

amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 90% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 31, 45, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 123, 
129, 139, and/or 145 the amino acid residue is B1; and 

(b) at positions 3, 5, 8, 10, 11, 14, 17, 18, 24, 27, 32, 37, 38, 47, 48, 49, 52, 57, 58, 61, 62, 
63, 68, 69, 79, 80, 82, 83, 89, 92, 100, 101, 104, 119, 120, 124, 125, 126, 128, 131, 
143, and/or 144 the amino acid residue is B2; 

wherein B1 is an amino acid selected from the group consisting of A, I, L, M, F, W, Y, 
and V; and B2 is an amino acid selected from the group consisting of R, N, D, C, Q, E, 
G, H, K, P, S, and T. 

123. The isolated or recombinant polypeptide of claim 119, wherein of 
the amino acid residues in the amino acid sequence that correspond to the following 
positions, at least 90% conform to the following restrictions: 

(a) at positions 1, 7, 9, 20, 36, 42, 50, 64, 72, 75, 76, 78, 94, 98, 110, 121, and/or 141 the 
amino acid residue is Z1; 

(b) at positions 13, 46, 56, 70, 107, 117, and/or 118 the amino acid residue is Z2; 

(c) at positions 23, 55, 71, 77, 88, and/or 109 the amino acid residue is Z3; 

(d) at positions 16, 21, 41, 73, 85, 99, and/or 111 the amino acid residue is Z4; 

(e) at positions 34 and/or 95 the amino acid residue is Z5; 

(f) at position 22, 25, 29, 43, 44, 66, 74, 87, 102, 108, 116, 122, 127, 133, 134, 136, and/or 
137 the amino acid residue is Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 
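
By way of illustration, the six residue classes recited above can be written as a lookup table. The following Python sketch is not part of the claims; the names `Z_GROUPS`, `z_class` and `allowed_residues` are hypothetical. The classes as recited are disjoint and together cover the 20 standard one-letter amino acid codes, so every standard residue maps to exactly one class.

```python
from typing import Dict, Set

# Residue classes exactly as recited in the claim (one-letter codes).
Z_GROUPS: Dict[str, Set[str]] = {
    "Z1": set("AILMV"),
    "Z2": set("FWY"),
    "Z3": set("NQST"),
    "Z4": set("RHK"),
    "Z5": set("DE"),
    "Z6": set("CGP"),
}


def z_class(residue: str) -> str:
    """Return the class name (Z1 to Z6) of a single one-letter residue code."""
    for name, members in Z_GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"not a standard one-letter amino acid code: {residue!r}")


def allowed_residues(*class_names: str) -> Set[str]:
    """Union of the named classes, e.g. for a restriction that permits two classes."""
    allowed: Set[str] = set()
    for name in class_names:
        allowed |= Z_GROUPS[name]
    return allowed
```

A restriction that permits more than one class at a position can then be expressed as the union of those classes, for example `allowed_residues("Z1", "Z2")`.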

124. The isolated or recombinant polypeptide of claim 120, wherein of the 
amino acid residues in the amino acid sequence that correspond to the following positions, 
at least 80% conform to the following restrictions: 

(a) at positions 2, 4, 15, 19, 26, 28, 51, 54, 86, 90, 91, 97, 103, 105, 106, 114, 129, 
139, and/or 145 the amino acid residue is Z1; 

(b) at positions 31 and/or 45 the amino acid residue is Z2; 

(c) at positions 8 and/or 89 the amino acid residue is Z3; 

(d) at positions 82, 92, 101 and/or 120 the amino acid residue is Z4; 

(e) at positions 3, 11, 27 and/or 79 the amino acid residue is Z5; 

(f) at position 123 the amino acid residue is Z1 or Z2; 

(g) at positions 12, 33, 35, 39, 53, 59, 112, 132, 135, 140, and/or 146 the amino acid 
residue is Z1 or Z3; 

(h) at position 30 the amino acid residue is Z1 or Z4; 

(i) at position 6 the amino acid residue is Z1 or Z6; 

(j) at positions 81 and/or 113 the amino acid residue is Z2 or Z3; 

(k) at positions 138 and/or 142 the amino acid residue is Z2 or Z4; 

(l) at positions 5, 17, 24, 57, 61, 124 and/or 126 the amino acid residue is Z3 or Z4; 

(m) at position 104 the amino acid residue is Z3 or Z5; 

(o) at positions 38, 52, 62 and/or 69 the amino acid residue is Z3 or Z6; 

(p) at positions 14, 119 and/or 144 the amino acid residue is Z4 or Z5; 

(q) at position 18 the amino acid residue is Z4 or Z6; 

(r) at positions 10, 32, 48, 63, 80 and/or 83 the amino acid residue is Z5 or Z6; 

(s) at position 40 the amino acid residue is Z1, Z2 or Z3; 

(t) at positions 65 and/or 96 the amino acid residue is Z1, Z3 or Z5; 

(u) at positions 84 and/or 115 the amino acid residue is Z1, Z3 or Z4; 

(v) at position 93 the amino acid residue is Z2, Z3 or Z4; 

(w) at position 130 the amino acid residue is Z2, Z4 or Z6; 

(x) at positions 47 and/or 58 the amino acid residue is Z3, Z4 or Z6; 

(y) at positions 49, 68, 100 and/or 143 the amino acid residue is Z3, Z4 or Z5; 

(z) at position 131 the amino acid residue is Z3, Z5 or Z6; 

(aa) at positions 125 and/or 128 the amino acid residue is Z4, Z5 or Z6; 

(ab) at position 67 the amino acid residue is Z1, Z3, Z4 or Z5; 

(ac) at position 60 the amino acid residue is Z1, Z4, Z5 or Z6; and 

(ad) at position 37 the amino acid residue is Z3, Z4, Z5 or Z6; 

wherein Z1 is an amino acid selected from the group consisting of A, I, L, M, and V; Z2 is 
an amino acid selected from the group consisting of F, W, and Y; Z3 is an amino acid 
selected from the group consisting of N, Q, S, and T; Z4 is an amino acid selected 
from the group consisting of R, H, and K; Z5 is an amino acid selected from the group 
consisting of D and E; and Z6 is an amino acid selected from the group consisting of 
C, G, and P. 

125. The isolated or recombinant polypeptide of claim 119, wherein of the 

amino acid residues in the amino acid sequence that correspond to the following positions, 

at least 80% conform to the following restrictions: 

(a) at position 9, 76, 94 and 110 the amino acid residue is A; 

(b) at position 29 and 108 the amino acid residue is C; 

(c) at position 34 the amino acid residue is D; 

(d) at position 95 the amino acid residue is E; 
(e) at position 56 the amino acid residue is F; 

(f) at position 43, 44, 66, 74, 87, 102, 116, 122, 127 and 136 the amino acid residue is G; 

(g) at position 41 the amino acid residue is H; 

(h) at position 7 the amino acid residue is I; 

(i) at position 85 the amino acid residue is K; 

(j) at position 20, 36, 42, 50, 72, 78, 98 and 121 the amino acid residue is L; 

(k) at position 1, 75 and 141 the amino acid residue is M; 

(l) at position 23, 64 and 109 the amino acid residue is N; 

(m) at position 22, 25, 133, 134 and 137 the amino acid residue is P; 

(n) at position 71 the amino acid residue is Q; 
(o) at position 16, 21, 73, 99 and 111 the amino acid residue is R; 

(p) at position 55 and 88 the amino acid residue is S; 

(q) at position 77 the amino acid residue is T; 

(r) at position 107 the amino acid residue is W; and 

(s) at position 13, 46, 70, 117 and 118 the amino acid residue is Y. 
126. The isolated or recombinant polypeptide of claim 24, wherein the 
amino acid residue in the amino acid sequence that corresponds to position 28 is V. 

127. The isolated or recombinant polypeptide of claim 42, wherein the 

amino acid sequence is selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

128. A transgenic plant or transgenic plant explant having an enhanced 

tolerance to glyphosate, wherein the plant or plant explant expresses a polypeptide with 
glyphosate N-acetyltransferase activity and at least one polypeptide imparting glyphosate 
tolerance by an additional mechanism. 

129. The transgenic plant or transgenic plant explant of claim 128, 
wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 

acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

130. The transgenic plant or transgenic plant explant of claim 129, 

wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is selected from the group consisting of a glyphosate-tolerant 
5-enolpyruvylshikimate-3-phosphate synthase and a glyphosate-tolerant glyphosate 
oxido-reductase. 

131. The transgenic plant or transgenic plant explant of claim 130, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase. 

132. The transgenic plant or transgenic plant explant of claim 130, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase. 

133. A transgenic plant or transgenic plant explant, wherein the plant or 
plant explant expresses a polypeptide with glyphosate N-acetyltransferase activity and at 

least one polypeptide imparting tolerance to an additional herbicide. 

134. The transgenic plant or transgenic plant explant of claim 133, 
wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 
acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

135. The transgenic plant or transgenic plant explant of claim 134, 

wherein the at least one polypeptide imparting tolerance to an additional herbicide is 
selected from the group consisting of a mutated hydroxyphenylpyruvatedioxygenase, a 
sulfonamide-tolerant acetolactate synthase, a sulfonamide-tolerant acetohydroxy acid 
synthase, an imidazolinone-tolerant acetolactate synthase, an imidazolinone-tolerant 

acetohydroxy acid synthase, a phosphinothricin acetyl transferase and a mutated 
protoporphyrinogen oxidase. 

136. The transgenic plant or transgenic plant explant of claim 135, 
wherein the at least one polypeptide imparting tolerance to an additional herbicide is a 
mutated hydroxyphenylpyruvatedioxygenase. 

137. The transgenic plant or transgenic plant explant of claim 135, 

wherein the at least one polypeptide imparting tolerance to an additional herbicide is a 
sulfonamide-tolerant acetolactate synthase. 

138. The transgenic plant or transgenic plant explant of claim 135, 
wherein the at least one polypeptide imparting tolerance to an additional herbicide is a 

sulfonamide-tolerant acetohydroxy acid synthase. 

139. The transgenic plant or transgenic plant explant of claim 135, 
wherein the at least one polypeptide imparting tolerance to an additional herbicide is an 
imidazolinone-tolerant acetolactate synthase. 

140. The transgenic plant or transgenic plant explant of claim 135, 
wherein the at least one polypeptide imparting tolerance to an additional herbicide is an 
imidazolinone-tolerant acetohydroxy acid synthase. 

141. The transgenic plant or transgenic plant explant of claim 135, 
wherein the at least one polypeptide imparting tolerance to an additional herbicide is a 

phosphinothricin acetyl transferase. 

142. The transgenic plant or transgenic plant explant of claim 135, 
wherein the at least one polypeptide imparting tolerance to an additional herbicide is a 
mutated protoporphyrinogen oxidase. 

143. A transgenic plant or transgenic plant explant having an enhanced 

tolerance to glyphosate, wherein the plant or plant explant expresses a polypeptide with 
glyphosate N-acetyltransferase activity, at least one polypeptide imparting glyphosate 
tolerance by an additional mechanism, and at least one polypeptide imparting tolerance to 
an additional herbicide. 

144. The transgenic plant or transgenic plant explant of claim 143, 

wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 
acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

145. The transgenic plant or transgenic plant explant of claim 144, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is selected from the group consisting of a glyphosate-tolerant 
5-enolpyruvylshikimate-3-phosphate synthase and a glyphosate-tolerant glyphosate 
oxido-reductase and the at least one polypeptide imparting tolerance to an additional herbicide is 
selected from the group consisting of a mutated hydroxyphenylpyruvatedioxygenase, a 
sulfonamide-tolerant acetolactate synthase, a sulfonamide-tolerant acetohydroxy acid 
synthase, an imidazolinone-tolerant acetolactate synthase, an imidazolinone-tolerant 
acetohydroxy acid synthase, a phosphinothricin acetyl transferase and a mutated 
protoporphyrinogen oxidase. 

146. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is a mutated 
hydroxyphenylpyruvatedioxygenase. 

147. The transgenic plant or transgenic plant explant of claim 145, 

wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is a sulfonamide- 
tolerant acetolactate synthase. 

148. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is a sulfonamide- 
tolerant acetohydroxy acid synthase. 

149. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is an imidazolinone- 
tolerant acetolactate synthase. 

150. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is an imidazolinone- 
tolerant acetohydroxy acid synthase. 

151. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is a 
phosphinothricin acetyl transferase. 

152. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and the 
at least one polypeptide imparting tolerance to an additional herbicide is a mutated 
protoporphyrinogen oxidase. 

153. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 

mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is a mutated 
hydroxyphenylpyruvatedioxygenase. 

154. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is a sulfonamide-tolerant 

acetolactate synthase. 

155. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is a sulfonamide-tolerant 

acetohydroxy acid synthase. 

156. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is an imidazolinone-tolerant 

acetolactate synthase. 

157. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is an imidazolinone-tolerant 

acetohydroxy acid synthase. 

158. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is a phosphinothricin acetyl 

transferase. 

159. The transgenic plant or transgenic plant explant of claim 145, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase and the at least one 
polypeptide imparting tolerance to an additional herbicide is a mutated 

protoporphyrinogen oxidase. 
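
For orientation, claims 146 to 159 pair each of the two glyphosate-tolerance polypeptides with each of the seven additional-herbicide polypeptides recited in claim 145, in the order listed there. The Python sketch below merely enumerates those 2 x 7 = 14 pairings; it is an illustrative aid, and the names `FIRST`, `SECOND` and `enumerate_pairings` are hypothetical.

```python
from itertools import product
from typing import List, Tuple

# Polypeptides imparting glyphosate tolerance by an additional mechanism.
FIRST = [
    "glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase",
    "glyphosate-tolerant glyphosate oxido-reductase",
]

# Polypeptides imparting tolerance to an additional herbicide.
SECOND = [
    "mutated hydroxyphenylpyruvatedioxygenase",
    "sulfonamide-tolerant acetolactate synthase",
    "sulfonamide-tolerant acetohydroxy acid synthase",
    "imidazolinone-tolerant acetolactate synthase",
    "imidazolinone-tolerant acetohydroxy acid synthase",
    "phosphinothricin acetyl transferase",
    "mutated protoporphyrinogen oxidase",
]


def enumerate_pairings(first_claim: int = 146) -> List[Tuple[int, str, str]]:
    """List (claim number, first polypeptide, second polypeptide) tuples.

    product() varies the second polypeptide fastest, which matches the
    order in which the dependent claims recite the pairings.
    """
    return [
        (first_claim + offset, first, second)
        for offset, (first, second) in enumerate(product(FIRST, SECOND))
    ]
```

Calling `enumerate_pairings()` yields 14 tuples, the first seven with the synthase and the last seven with the oxido-reductase.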

160. A transgenic plant or transgenic plant explant having an enhanced 
tolerance to glyphosate, wherein the plant or plant explant expresses a polypeptide with 
glyphosate N-acetyltransferase activity and at least one of a polypeptide selected from the 

group consisting of a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase 
and a glyphosate-tolerant glyphosate oxido-reductase. 

161. The transgenic plant or transgenic plant explant of claim 160, 
wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 

acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

162. The transgenic plant or transgenic plant explant of claim 161, 
wherein the at least one polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3- 
phosphate synthase. 

163. The transgenic plant or transgenic plant explant of claim 161, 
wherein the at least one polypeptide is a glyphosate-tolerant glyphosate oxido-reductase. 

164. A transgenic plant or transgenic plant explant, wherein the plant or 
plant explant expresses a polypeptide with glyphosate N-acetyltransferase activity and at 
least one polypeptide selected from the group consisting of a mutated 
hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate synthase, a 

sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant acetolactate 
synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a phosphinothricin acetyl 
transferase and a mutated protoporphyrinogen oxidase. 

165. The transgenic plant or transgenic plant explant of claim 164, 
wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 

acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

166. The transgenic plant or transgenic plant explant of claim 165, 
wherein the at least one polypeptide is a mutated hydroxyphenylpyruvatedioxygenase. 

167. The transgenic plant or transgenic plant explant of claim 165, 
wherein the at least one polypeptide is a sulfonamide-tolerant acetolactate synthase. 

168. The transgenic plant or transgenic plant explant of claim 165, 

wherein the at least one polypeptide is a sulfonamide-tolerant acetohydroxy acid synthase. 

169. The transgenic plant or transgenic plant explant of claim 165, 
wherein the at least one polypeptide is an imidazolinone-tolerant acetolactate synthase. 

170. The transgenic plant or transgenic plant explant of claim 165, 
wherein the at least one polypeptide is an imidazolinone-tolerant acetohydroxy acid 

synthase. 

171. The transgenic plant or transgenic plant explant of claim 165, 
wherein the at least one polypeptide is a phosphinothricin acetyl transferase. 

172. The transgenic plant or transgenic plant explant of claim 165, 
wherein the at least one polypeptide is a mutated protoporphyrinogen oxidase. 

173. A transgenic plant or transgenic plant explant having an enhanced 
tolerance to glyphosate, wherein the plant or plant explant expresses a polypeptide with 
glyphosate N-acetyltransferase activity, at least one of a first polypeptide selected from the 
group consisting of a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase 
and a glyphosate-tolerant glyphosate oxido-reductase and at least one of a second 
polypeptide selected from the group consisting of a mutated 
hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate synthase, a 
sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant acetolactate 
synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a phosphinothricin acetyl 
transferase and a mutated protoporphyrinogen oxidase. 

174. The transgenic plant or transgenic plant explant of claim 173, 
wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 
acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

175. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 
synthase and the second polypeptide is a mutated hydroxyphenylpyruvatedioxygenase. 

176. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 
synthase and the second polypeptide is a sulfonamide-tolerant acetolactate synthase. 

177. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 
synthase and the second polypeptide is a sulfonamide-tolerant acetohydroxy acid synthase. 

178. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 
synthase and the second polypeptide is an imidazolinone-tolerant acetolactate synthase. 

179. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 
synthase and the second polypeptide is an imidazolinone-tolerant acetohydroxy acid 
synthase. 

180. The transgenic plant or transgenic plant explant of claim 174, 

wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 

synthase and the second polypeptide is a phosphinothricin acetyl transferase. 

181. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 
synthase and the second polypeptide is a mutated protoporphyrinogen oxidase. 

182. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is a mutated hydroxyphenylpyruvatedioxygenase. 

183. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is a sulfonamide-tolerant acetolactate synthase. 

184. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is a sulfonamide-tolerant acetohydroxy acid synthase. 

185. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is an imidazolinone-tolerant acetolactate synthase. 

186. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is an imidazolinone-tolerant acetohydroxy acid synthase. 

187. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is a phosphinothricin acetyl transferase. 

188. The transgenic plant or transgenic plant explant of claim 174, 
wherein the first polypeptide is a glyphosate-tolerant glyphosate oxido-reductase and the 
second polypeptide is a mutated protoporphyrinogen oxidase. 

189. A transgenic plant or transgenic plant explant having an enhanced 
tolerance to glyphosate, wherein the plant or plant explant expresses a polypeptide with 
glyphosate N-acetyltransferase activity and at least one polypeptide selected from the 
group consisting of a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase, a 
glyphosate-tolerant glyphosate oxido-reductase, a mutated 
hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate synthase, a 
sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant acetolactate 
synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a phosphinothricin acetyl 
transferase and a mutated protoporphyrinogen oxidase. 

190. The transgenic plant or transgenic plant explant of claim 189, 
wherein the polypeptide with glyphosate N-acetyltransferase activity comprises an amino 
acid sequence selected from the group consisting of SEQ ID NOS: 6-10 and 263-514. 

191. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate 

synthase. 

192. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a glyphosate-tolerant glyphosate oxido-reductase. 

193. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a mutated hydroxyphenylpyruvatedioxygenase.

194. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a sulfonamide-tolerant acetolactate synthase. 

195. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a sulfonamide-tolerant acetohydroxy acid synthase. 

196. The transgenic plant or transgenic plant explant of claim 190,
wherein the polypeptide is an imidazolinone-tolerant acetolactate synthase.

197. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is an imidazolinone-tolerant acetohydroxy acid synthase. 

198. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a phosphinothricin acetyl transferase.

199. The transgenic plant or transgenic plant explant of claim 190, 
wherein the polypeptide is a mutated protoporphyrinogen oxidase. 

200. A method for controlling weeds in a field containing a crop
comprising:

(a) planting the field with crop seeds or plants which are transformed with a gene
encoding a glyphosate N-acetyltransferase and at least one gene encoding a
polypeptide imparting glyphosate tolerance by an additional mechanism; and

(b) applying to the crop and weeds in the field an effective application of glyphosate
sufficient to inhibit growth of the weeds in the field without significantly affecting the
crop.

201. The method of claim 200, wherein the gene encoding a glyphosate 
N-acetyltransferase comprises a polynucleotide sequence selected from the group 
consisting of SEQ ID NOS: 1-5 and 11-262. 

202. The method of claim 201, wherein the polypeptide imparting
glyphosate tolerance by an additional mechanism is selected from the group consisting of 
a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and a glyphosate- 
tolerant glyphosate oxido-reductase. 

203. The method of claim 202, wherein the polypeptide imparting 
glyphosate tolerance by an additional mechanism is a glyphosate-tolerant 5- 
enolpyruvylshikimate-3-phosphate synthase. 

204. The method of claim 202, wherein the polypeptide imparting 
glyphosate tolerance by an additional mechanism is a glyphosate-tolerant glyphosate 
oxido-reductase. 

205. A method for preventing emergence of glyphosate resistant weeds 
in a field containing a crop comprising: 

(a) planting the field with crop seeds or plants which are transformed with a gene 
encoding a glyphosate N-acetyltransferase and at least one gene encoding a 
polypeptide imparting glyphosate tolerance by an additional mechanism; and 

(b) applying to the crop and weeds in the field an effective application of glyphosate. 

206. The method of claim 205, wherein the gene encoding a glyphosate 
N-acetyltransferase comprises a polynucleotide sequence selected from the group 
consisting of SEQ ID NOS: 1-5 and 11-262.

207. The method of claim 206, wherein the polypeptide imparting 
glyphosate tolerance by an additional mechanism is selected from the group consisting of 
a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase and a glyphosate- 
tolerant glyphosate oxido-reductase. 

208. The method of claim 207, wherein the polypeptide imparting 
glyphosate tolerance by an additional mechanism is a glyphosate-tolerant 5- 
enolpyruvylshikimate-3-phosphate synthase. 

209. The method of claim 207, wherein the polypeptide imparting 
glyphosate tolerance by an additional mechanism is a glyphosate-tolerant glyphosate 
oxido-reductase. 

210. A method for selectively controlling weeds in a field containing a 
crop comprising: 

(a) planting the field with crop seeds or plants which are transformed with a gene
encoding a glyphosate N-acetyltransferase and at least one gene encoding a
polypeptide imparting tolerance to an additional herbicide; and

(b) applying to the crop and weeds in the field a simultaneous or chronologically
staggered application of glyphosate and the additional herbicide which is sufficient to
inhibit growth of the weeds in the field without significantly affecting the crop.

211. The method of claim 210, wherein the gene encoding a glyphosate
N-acetyltransferase comprises a polynucleotide sequence selected from the group
consisting of SEQ ID NOS: 1-5 and 11-262.

212. The method of claim 211, wherein the at least one polypeptide 
imparting tolerance to an additional herbicide is selected from the group consisting of a 
mutated hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate 
synthase, a sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant 
acetolactate synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a 
phosphinothricin acetyl transferase and a mutated protoporphyrinogen oxidase. 

213. The method of claim 211, wherein the additional herbicide is 
selected from the group consisting of a hydroxyphenylpyruvatedioxygenase inhibitor, 
sulfonamide, imidazolinone, bialaphos, phosphinothricin, azafenidin, butafenacil, 
sulfosate, glufosinate, and a protox inhibitor. 

214. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is a mutated hydroxyphenylpyruvatedioxygenase. 

215. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetolactate synthase. 

216. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetohydroxy acid synthase. 

217. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetolactate synthase. 

218. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetohydroxy acid 
synthase. 

219. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is a phosphinothricin acetyl transferase. 

220. The method of claim 212, wherein the polypeptide imparting 
tolerance to an additional herbicide is a mutated protoporphyrinogen oxidase. 

221. A method for preventing emergence of herbicide resistant weeds in
a field containing a crop comprising: 

(a) planting the field with crop seeds or plants which are transformed with a gene
encoding a glyphosate N-acetyltransferase and at least one gene encoding a
polypeptide imparting tolerance to an additional herbicide; and

(b) applying to the crop and weeds in the field a simultaneous or chronologically 
staggered application of glyphosate and the additional herbicide. 

222. The method of claim 221, wherein the gene encoding a glyphosate 
N-acetyltransferase comprises a polynucleotide sequence selected from the group 
consisting of SEQ ID NOS: 1-5 and 11-262. 

223. The method of claim 222, wherein the at least one polypeptide 
imparting tolerance to an additional herbicide is selected from the group consisting of a 
mutated hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate 
synthase, a sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant 
acetolactate synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a 
phosphinothricin acetyl transferase and a mutated protoporphyrinogen oxidase. 

224. The method of claim 221, wherein the additional herbicide is
selected from the group consisting of a hydroxyphenylpyruvatedioxygenase inhibitor, 
sulfonamide, imidazolinone, bialaphos, phosphinothricin, azafenidin, butafenacil, 
sulfosate, glufosinate, and a protox inhibitor. 

225. The method of claim 223, wherein the polypeptide imparting 
tolerance to an additional herbicide is a mutated hydroxyphenylpyruvatedioxygenase. 

226. The method of claim 223, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetolactate synthase. 

227. The method of claim 223, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetohydroxy acid synthase. 

228. The method of claim 223, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetolactate synthase. 

229. The method of claim 223, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetohydroxy acid 
synthase. 

230. The method of claim 223, wherein the polypeptide imparting 
tolerance to an additional herbicide is a phosphinothricin acetyl transferase. 

231. The method of claim 223, wherein the polypeptide imparting
tolerance to an additional herbicide is a mutated protoporphyrinogen oxidase. 

232. A method for selectively controlling weeds in a field containing a
crop comprising: 

(a) planting the field with crop seeds or plants which are transformed with a gene
encoding a glyphosate N-acetyltransferase, at least one gene encoding a polypeptide
imparting glyphosate tolerance by an additional mechanism and at least one gene
encoding a polypeptide imparting tolerance to an additional herbicide; and

(b) applying to the crop and weeds in the field a simultaneous or chronologically 
staggered application of glyphosate and the additional herbicide which is sufficient to 
inhibit growth of the weeds in the field without significantly affecting the crop. 

233. The method of claim 232, wherein the gene encoding a glyphosate
N-acetyltransferase comprises a polynucleotide sequence selected from the group
consisting of SEQ ID NOS: 1-5 and 11-262.

234. The transgenic plant or transgenic plant explant of claim 233, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional
mechanism is selected from the group consisting of a glyphosate-tolerant
5-enolpyruvylshikimate-3-phosphate synthase and a glyphosate-tolerant glyphosate
oxido-reductase.

235. The transgenic plant or transgenic plant explant of claim 234, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional
mechanism is a glyphosate-tolerant 5-enolpyruvylshikimate-3-phosphate synthase.

236. The transgenic plant or transgenic plant explant of claim 234, 
wherein the at least one polypeptide imparting glyphosate tolerance by an additional 
mechanism is a glyphosate-tolerant glyphosate oxido-reductase. 

237. The method of claim 233, wherein the at least one polypeptide 
imparting tolerance to an additional herbicide is selected from the group consisting of a
mutated hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate
synthase, a sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant 
acetolactate synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a 
phosphinothricin acetyl transferase and a mutated protoporphyrinogen oxidase. 

238. The method of claim 233, wherein the additional herbicide is
selected from the group consisting of a hydroxyphenylpyruvatedioxygenase inhibitor,
sulfonamide, imidazolinone, bialaphos, phosphinothricin, azafenidin, butafenacil, 
sulfosate, glufosinate, and a protox inhibitor. 

239. The method of claim 237, wherein the polypeptide imparting
tolerance to an additional herbicide is a mutated hydroxyphenylpyruvatedioxygenase. 

240. The method of claim 237, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetolactate synthase. 

241. The method of claim 237, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetohydroxy acid synthase. 

242. The method of claim 237, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetolactate synthase. 

243. The method of claim 237, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetohydroxy acid 
synthase. 

244. The method of claim 237, wherein the polypeptide imparting 
tolerance to an additional herbicide is a phosphinothricin acetyl transferase. 

245. The method of claim 237, wherein the polypeptide imparting 
tolerance to an additional herbicide is a mutated protoporphyrinogen oxidase. 

246. A method for preventing emergence of herbicide resistant weeds in 
a field containing a crop comprising: 

(a) planting the field with crop seeds or plants which are transformed with a gene 
encoding a glyphosate N-acetyltransferase, at least one gene encoding a polypeptide 
imparting glyphosate tolerance by an additional mechanism and at least one gene 
encoding a polypeptide imparting tolerance to an additional herbicide; and

(b) applying to the crop and weeds in the field a simultaneous or chronologically 
staggered application of glyphosate and the additional herbicide. 

247. The method of claim 246, wherein the gene encoding a glyphosate 
N-acetyltransferase comprises a polynucleotide sequence selected from the group 
consisting of SEQ ID NOS: 1-5 and 11-262. 

248. The method of claim 247, wherein the at least one polypeptide 
imparting tolerance to an additional herbicide is selected from the group consisting of a 
mutated hydroxyphenylpyruvatedioxygenase, a sulfonamide-tolerant acetolactate 
synthase, a sulfonamide-tolerant acetohydroxy acid synthase, an imidazolinone-tolerant 
acetolactate synthase, an imidazolinone-tolerant acetohydroxy acid synthase, a 
phosphinothricin acetyl transferase and a mutated protoporphyrinogen oxidase. 

249. The method of claim 247, wherein the additional herbicide is
selected from the group consisting of a hydroxyphenylpyruvatedioxygenase inhibitor,
sulfonamide, imidazolinone, bialaphos, phosphinothricin, azafenidin, butafenacil,
sulfosate, glufosinate, and a protox inhibitor.

250. The method of claim 248, wherein the polypeptide imparting 
tolerance to an additional herbicide is a mutated hydroxyphenylpyruvatedioxygenase. 

251. The method of claim 248, wherein the polypeptide imparting
tolerance to an additional herbicide is a sulfonamide-tolerant acetolactate synthase. 

252. The method of claim 248, wherein the polypeptide imparting 
tolerance to an additional herbicide is a sulfonamide-tolerant acetohydroxy acid synthase. 

253. The method of claim 248, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetolactate synthase. 

254. The method of claim 248, wherein the polypeptide imparting 
tolerance to an additional herbicide is an imidazolinone-tolerant acetohydroxy acid 
synthase. 

255. The method of claim 248, wherein the polypeptide imparting 
tolerance to an additional herbicide is a phosphinothricin acetyl transferase. 

256. The method of claim 248, wherein the polypeptide imparting 
tolerance to an additional herbicide is a mutated protoporphyrinogen oxidase. 